Title | Authors | Venue | Cited by | Year |
S2 Transformer for Image Captioning | P Zeng, H Zhang, J Song, L Gao | Proceedings of the Thirty-First International Joint Conference on Artificial …, 2022 | 49 | 2022 |
Video Question Answering with Prior Knowledge and Object-sensitive Learning | P Zeng, H Zhang, L Gao, J Song, HT Shen | IEEE Transactions on Image Processing, 5936-5948, 2022 | 31 | 2022 |
Memory-based augmentation network for video captioning | S Jing, H Zhang, P Zeng, L Gao, J Song, HT Shen | IEEE Transactions on Multimedia, 2023 | 16 | 2023 |
Learning visual question answering on controlled semantic noisy labels | H Zhang, P Zeng, Y Hu, J Qian, J Song, L Gao | Pattern Recognition 138, 109339, 2023 | 16 | 2023 |
Visual Commonsense-aware Representation Network for Video Captioning | P Zeng, H Zhang, L Gao, X Li, J Qian, HT Shen | IEEE Transactions on Neural Networks and Learning Systems, 2023 | 13 | 2023 |
A differentiable semantic metric approximation in probabilistic embedding for cross-modal retrieval | H Li, J Song, L Gao, P Zeng, H Zhang, G Li | Advances in Neural Information Processing Systems 35, 11934-11946, 2022 | 12 | 2022 |
Depth-aware sparse transformer for video-language learning | H Zhang, L Gao, P Zeng, A Hanjalic, HT Shen | Proceedings of the 31st ACM International Conference on Multimedia, 4778-4787, 2023 | 9 | 2023 |
SPT: Spatial pyramid transformer for image captioning | H Zhang, P Zeng, L Gao, X Lyu, J Song, HT Shen | IEEE Transactions on Circuits and Systems for Video Technology, 2023 | 7 | 2023 |
You should know more: Learning external knowledge for visual dialog | L Zhao, H Zhang, X Li, S Yang, Y Song | Neurocomputing 488, 54-65, 2022 | 3 | 2022 |
UMP: Unified Modality-aware Prompt Tuning for Text-Video Retrieval | H Zhang, P Zeng, L Gao, J Song, HT Shen | IEEE Transactions on Circuits and Systems for Video Technology, 2024 | 1 | 2024 |
Text-Video Retrieval with Global-Local Semantic Consistent Learning | H Zhang, P Zeng, L Gao, J Song, Y Duan, X Lyu, H Shen | arXiv preprint arXiv:2405.12710, 2024 | 1 | 2024 |
MMEvol: Empowering Multimodal Large Language Models with Evol-Instruct | R Luo, H Zhang, L Chen, TE Lin, X Liu, Y Wu, M Yang, M Wang, P Zeng, ... | arXiv preprint arXiv:2409.05840, 2024 | | 2024 |
Pedestrian Attributes Recognition for UAV-Human | H Ni, P Lai, Y Li, P Zeng, H Zhang, J Song | 2024 IEEE International Conference on Multimedia and Expo Workshops (ICMEW), 1-5, 2024 | | 2024 |
MPT: Multi-grained Prompt Tuning for Text-Video Retrieval | H Zhang, P Zeng, L Gao, J Song, HT Shen | ACM Multimedia 2024, 2024 | | 2024 |