Interaction-aware spatio-temporal pyramid attention networks for action classification W Hu, H Liu, Y Du, C Yuan, B Li, S Maybank IEEE Transactions on Pattern Analysis and Machine Intelligence 44 (10), 7010 …, 2022 | 114* | 2022 |
mPLUG-Owl2: Revolutionizing multi-modal large language model with modality collaboration Q Ye, H Xu, J Ye, M Yan, A Hu, H Liu, Q Qian, J Zhang, F Huang Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2024 | 113 | 2024 |
TranSkeleton: Hierarchical spatial–temporal transformer for skeleton-based action recognition H Liu, Y Liu, Y Chen, C Yuan, B Li, W Hu IEEE Transactions on Circuits and Systems for Video Technology 33 (8), 4137-4148, 2023 | 20 | 2023 |
Learning Semantics-Grounded Vocabulary Representation for Video-Text Retrieval Y Shi, H Liu, H Xu, Z Ma, Q Ye, A Hu, M Yan, J Zhang, F Huang, C Yuan, ... Proceedings of the 31st ACM International Conference on Multimedia, 4460-4470, 2023 | 2 | 2023 |
Exploring motion information for distractor suppression in visual tracking K Liu, J Gao, H Liu, L Li, B Li, W Hu Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2022 | 1 | 2022 |
Semantics-enhanced Cross-modal Masked Image Modeling for Vision-Language Pre-training H Liu, Y Shi, H Xu, C Yuan, Q Ye, C Li, M Yan, J Zhang, F Huang, B Li, ... arXiv preprint arXiv:2403.00249, 2024 | | 2024 |
Unifying Latent and Lexicon Representations for Effective Video-Text Retrieval H Liu, Y Shi, H Xu, C Yuan, Q Ye, C Li, M Yan, J Zhang, F Huang, B Li, ... arXiv preprint arXiv:2402.16769, 2024 | | 2024 |