Haonan Zhang
Other names: 张 浩楠
UESTC | Alibaba TongYi Laboratory
Verified email at std.uestc.edu.cn - Homepage
Title
Cited by
Year
S2 Transformer for Image Captioning
P Zeng, H Zhang, J Song, L Gao
Proceedings of the Thirty-First International Joint Conference on Artificial …, 2022
Cited by 49, 2022
Video Question Answering with Prior Knowledge and Object-sensitive Learning
P Zeng, H Zhang, L Gao, J Song, HT Shen
IEEE Transactions on Image Processing, 5936-5948, 2022
Cited by 31, 2022
Memory-based augmentation network for video captioning
S Jing, H Zhang, P Zeng, L Gao, J Song, HT Shen
IEEE Transactions on Multimedia, 2023
Cited by 16, 2023
Learning visual question answering on controlled semantic noisy labels
H Zhang, P Zeng, Y Hu, J Qian, J Song, L Gao
Pattern Recognition 138, 109339, 2023
Cited by 16, 2023
Visual Commonsense-aware Representation Network for Video Captioning
P Zeng, H Zhang, L Gao, X Li, J Qian, HT Shen
IEEE Transactions on Neural Networks and Learning Systems, 2023
Cited by 13, 2023
A differentiable semantic metric approximation in probabilistic embedding for cross-modal retrieval
H Li, J Song, L Gao, P Zeng, H Zhang, G Li
Advances in Neural Information Processing Systems 35, 11934-11946, 2022
Cited by 12, 2022
Depth-aware sparse transformer for video-language learning
H Zhang, L Gao, P Zeng, A Hanjalic, HT Shen
Proceedings of the 31st ACM International Conference on Multimedia, 4778-4787, 2023
Cited by 9, 2023
SPT: Spatial pyramid transformer for image captioning
H Zhang, P Zeng, L Gao, X Lyu, J Song, HT Shen
IEEE Transactions on Circuits and Systems for Video Technology, 2023
Cited by 7, 2023
You should know more: Learning external knowledge for visual dialog
L Zhao, H Zhang, X Li, S Yang, Y Song
Neurocomputing 488, 54-65, 2022
Cited by 3, 2022
UMP: Unified Modality-aware Prompt Tuning for Text-Video Retrieval
H Zhang, P Zeng, L Gao, J Song, HT Shen
IEEE Transactions on Circuits and Systems for Video Technology, 2024
Cited by 1, 2024
Text-Video Retrieval with Global-Local Semantic Consistent Learning
H Zhang, P Zeng, L Gao, J Song, Y Duan, X Lyu, H Shen
arXiv preprint arXiv:2405.12710, 2024
Cited by 1, 2024
MMEvol: Empowering Multimodal Large Language Models with Evol-Instruct
R Luo, H Zhang, L Chen, TE Lin, X Liu, Y Wu, M Yang, M Wang, P Zeng, ...
arXiv preprint arXiv:2409.05840, 2024
2024
Pedestrian Attributes Recognition for UAV-Human
H Ni, P Lai, Y Li, P Zeng, H Zhang, J Song
2024 IEEE International Conference on Multimedia and Expo Workshops (ICMEW), 1-5, 2024
2024
MPT: Multi-grained Prompt Tuning for Text-Video Retrieval
H Zhang, P Zeng, L Gao, J Song, HT Shen
ACM Multimedia 2024, 2024
2024
Articles 1–14