EfficientViT: Lightweight multi-scale attention for high-resolution dense prediction H Cai, J Li, M Hu, C Gan, S Han Proceedings of the IEEE/CVF International Conference on Computer Vision …, 2023 | 48* | 2023 |
MultiPLY: A multisensory object-centric embodied large language model in 3D world Y Hong, Z Zheng, P Chen, Y Wang, J Li, C Gan Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2024 | 6 | 2024 |
Constraint-aware and ranking-distilled token pruning for efficient transformer inference J Li, LL Zhang, J Xu, Y Wang, S Yan, Y Xia, Y Yang, T Cao, H Sun, ... Proceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and …, 2023 | 6 | 2023 |
CoVLM: Composing visual entities and relationships in large language models via communicative decoding J Li, D Chen, Y Hong, Z Chen, P Chen, Y Shen, C Gan arXiv preprint arXiv:2311.03354, 2023 | 5 | 2023 |
FlexAttention for Efficient High-Resolution Vision-Language Models J Li, D Chen, T Cai, P Chen, Y Hong, Z Chen, Y Shen, C Gan arXiv preprint arXiv:2407.20228, 2024 | 1 | 2024 |