Zhou Yu (余宙)
Title
Cited by
Year
Deep modular co-attention networks for visual question answering
Z Yu, J Yu, Y Cui, D Tao, Q Tian
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 6281-6290, 2019
Cited by: 950, Year: 2019
Multi-modal factorized bilinear pooling with co-attention learning for visual question answering
Z Yu, J Yu, J Fan, D Tao
IEEE International Conference on Computer Vision (ICCV), 1821-1830, 2017
Cited by: 794, Year: 2017
Beyond bilinear: Generalized multimodal factorized high-order pooling for visual question answering
Z Yu, J Yu, C Xiang, J Fan, D Tao
IEEE Transactions on Neural Networks and Learning Systems 29 (12), 5947-5959, 2018
Cited by: 522, Year: 2018
Multimodal transformer with multi-view visual representation for image captioning
J Yu, J Li, Z Yu, Q Huang
IEEE Transactions on Circuits and Systems for Video Technology 30 (12), 4467 …, 2020
Cited by: 405, Year: 2020
ActivityNet-QA: A dataset for understanding complex web videos via question answering
Z Yu, D Xu, J Yu, T Yu, Z Zhao, Y Zhuang, D Tao
Proceedings of the AAAI Conference on Artificial Intelligence, 9127-9134, 2019
Cited by: 297, Year: 2019
Sparse multi-modal hashing
F Wu, Z Yu, Y Yang, S Tang, Y Zhang, Y Zhuang
IEEE Transactions on Multimedia 16 (2), 427-439, 2014
Cited by: 148, Year: 2014
Prompting large language models with answer heuristics for knowledge-based visual question answering
Z Shao, Z Yu, M Wang, J Yu
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 14974-14983, 2023
Cited by: 141, Year: 2023
Rethinking diversified and discriminative proposal generation for visual grounding
Z Yu, J Yu, C Xiang, Z Zhao, Q Tian, D Tao
International Joint Conference on Artificial Intelligence (IJCAI), 1114-1120, 2018
Cited by: 134, Year: 2018
Discriminative coupled dictionary hashing for fast cross-media retrieval
Z Yu, F Wu, Y Yang, Q Tian, J Luo, Y Zhuang
Proceedings of the 37th international ACM SIGIR conference on Research …, 2014
Cited by: 131, Year: 2014
Deep multimodal neural architecture search
Z Yu, Y Cui, J Yu, M Wang, D Tao, Q Tian
Proceedings of the 28th ACM International Conference on Multimedia, 3743-3752, 2020
Cited by: 94, Year: 2020
SPRNet: Single pixel reconstruction for one-stage instance segmentation
J Yu, J Yao, J Zhang, Z Yu, D Tao
IEEE Transactions on Cybernetics 51 (4), 1731-1742, 2021
Cited by: 81, Year: 2021
Open-ended long-form video question answering via adaptive hierarchical reinforced networks
Z Zhao, Z Zhang, S Xiao, Z Yu, J Yu, D Cai, F Wu, Y Zhuang
International Joint Conference on Artificial Intelligence (IJCAI), 3683-3689, 2018
Cited by: 69, Year: 2018
MARN: Multi-level attentional reconstruction networks for weakly supervised video temporal grounding
Y Song, J Wang, L Ma, J Yu, J Liang, L Yuan, Z Yu
Neurocomputing 554, 126625, 2023
Cited by: 56*, Year: 2023
ROSITA: Enhancing vision-and-language semantic alignments via cross-and intra-modal knowledge integration
Y Cui, Z Yu, C Wang, Z Zhao, J Zhang, M Wang, J Yu
Proceedings of the 29th ACM International Conference on Multimedia, 797-806, 2021
Cited by: 56, Year: 2021
Long-term video question answering via multimodal hierarchical memory attentive networks
T Yu, J Yu, Z Yu, Q Huang, Q Tian
IEEE Transactions on Circuits and Systems for Video Technology 31 (3), 931-944, 2020
Cited by: 52, Year: 2020
Compositional attention networks with two-stream fusion for video question answering
T Yu, J Yu, Z Yu, D Tao
IEEE Transactions on Image Processing 29, 1204-1218, 2019
Cited by: 43, Year: 2019
Multimodal unified attention networks for vision-and-language interactions
Z Yu, Y Cui, J Yu, D Tao, Q Tian
arXiv preprint arXiv:1908.04107, 2019
Cited by: 43, Year: 2019
Comprehensive distance-preserving autoencoders for cross-modal retrieval
Y Zhan, J Yu, Z Yu, R Zhang, D Tao, Q Tian
Proceedings of the 26th ACM international conference on Multimedia, 1137-1145, 2018
Cited by: 37, Year: 2018
Cross-media hashing with neural networks
Y Zhuang, Z Yu, W Wang, F Wu, S Tang, J Shao
Proceedings of the 22nd ACM international conference on Multimedia, 901-904, 2014
Cited by: 35, Year: 2014
Accelerated masked transformer for dense video captioning
Z Yu, N Han
Neurocomputing 445, 72-80, 2021
Cited by: 22, Year: 2021