Dian Li
Tencent.com
Verified email at tencent.com
Title
Cited by
Year
Video-based emotion recognition using CNN-RNN and C3D hybrid networks
Y Fan, X Lu, D Li, Y Liu
Proceedings of the 18th ACM international conference on multimodal …, 2016
Cited by: 667; Year: 2016
Bridging video-text retrieval with multiple choice questions
Y Ge, Y Ge, X Liu, D Li, Y Shan, X Qie, P Luo
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2022
Cited by: 151; Year: 2022
Clip4caption: Clip for video caption
M Tang, Z Wang, Z Liu, F Rao, D Li, X Li
Proceedings of the 29th ACM International Conference on Multimedia, 4858-4862, 2021
Cited by: 123; Year: 2021
Transform domain transcoding from MPEG-2 to H.264 with interpolation drift-error compensation
T Qian, J Sun, D Li, X Yang, J Wang
IEEE Transactions on Circuits and Systems for Video Technology 16 (4), 523-534, 2006
Cited by: 45; Year: 2006
Masked image modeling with denoising contrast
K Yi, Y Ge, X Li, S Yang, D Li, J Wu, Y Shan, X Qie
arXiv preprint arXiv:2205.09616, 2022
Cited by: 37; Year: 2022
Learning scale-consistent attention part network for fine-grained image recognition
H Liu, J Li, D Li, J See, W Lin
IEEE Transactions on Multimedia 24, 2902-2913, 2021
Cited by: 29; Year: 2021
Enhancing self-supervised video representation learning via multi-level feature optimization
R Qian, Y Li, H Liu, J See, S Ding, X Liu, D Li, W Lin
Proceedings of the IEEE/CVF international conference on computer vision …, 2021
Cited by: 28; Year: 2021
Taggpt: Large language models are zero-shot multimodal taggers
C Li, Y Ge, J Mao, D Li, Y Shan
arXiv preprint arXiv:2304.03022, 2023
Cited by: 17; Year: 2023
Ca-ssl: Class-agnostic semi-supervised learning for detection and segmentation
L Qi, J Kuen, Z Lin, J Gu, F Rao, D Li, W Guo, Z Wen, MH Yang, J Jia
European Conference on Computer Vision, 59-77, 2022
Cited by: 13; Year: 2022
Tencent-mvse: A large-scale benchmark dataset for multi-modal video similarity evaluation
Z Zeng, Y Luo, Z Liu, F Rao, D Li, W Guo, Z Wen
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2022
Cited by: 10; Year: 2022
Rils: Masked visual reconstruction in language semantic space
S Yang, Y Ge, K Yi, D Li, Y Shan, X Qie, X Wang
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2023
Cited by: 8*; Year: 2023
Clip4caption++: Multi-clip for video caption
M Tang, Z Wang, Z Zeng, F Rao, D Li
arXiv preprint arXiv:2110.05204, 2021
Cited by: 8; Year: 2021
Vision-language instruction tuning: A review and analysis
C Li, Y Ge, D Li, Y Shan
Transactions on Machine Learning Research, 2023
Cited by: 5; Year: 2023
Transform domain transcoding from MPEG-2 to H.264 with interpolation error drift compensation
T Qian, J Sun, D Li, X Yang, J Wang
IEEE Workshop on Signal Processing Systems Design and Implementation, 2005 …, 2005
Cited by: 4; Year: 2005
Unified Pretraining Target Based Video-music Retrieval With Music Rhythm And Video Optical Flow Information
T Mao, S Liu, Y Zhang, D Li, Y Shan
ICASSP 2024-2024 IEEE International Conference on Acoustics, Speech and …, 2024
Cited by: 2; Year: 2024
Controllable augmentations for video representation learning
R Qian, W Lin, J See, D Li
Visual Intelligence 2 (1), 1, 2024
Cited by: 2; Year: 2024
Video-CCAM: Enhancing Video-Language Understanding with Causal Cross-Attention Masks for Short and Long Videos
J Fei, D Li, Z Deng, Z Wang, G Liu, H Wang
arXiv preprint arXiv:2408.14023, 2024
Cited by: 1; Year: 2024
MetaTool: Facilitating Large Language Models to Master Tools with Meta-task Augmentation
X Wang, D Li, Y Zhao, H Wang
arXiv preprint arXiv:2407.12871, 2024
Year: 2024
Humtrans: A Novel Open-Source Dataset for Humming Melody Transcription and Beyond
S Liu, X Li, D Li, Y Shan
ICASSP 2024-2024 IEEE International Conference on Acoustics, Speech and …, 2024
Year: 2024
Articles 1–19