Multiple sound sources localization from coarse to fine R Qian, D Hu, H Dinkel, M Wu, N Xu, W Lin Computer Vision–ECCV 2020: 16th European Conference, Glasgow, UK, August 23 …, 2020 | 155 | 2020 |
Discriminative sounding objects localization via self-supervised audiovisual matching D Hu, R Qian, M Jiang, X Tan, S Wen, E Ding, W Lin, D Dou Advances in Neural Information Processing Systems 33, 10077-10087, 2020 | 139 | 2020 |
Human in events: A large-scale benchmark for human-centric video analysis in complex events W Lin, H Liu, S Liu, Y Li, R Qian, T Wang, N Xu, H Xiong, GJ Qi, N Sebe arXiv preprint arXiv:2005.04490, 2020 | 116 | 2020 |
ATRW: a benchmark for Amur tiger re-identification in the wild S Li, J Li, H Tang, R Qian, W Lin arXiv preprint arXiv:1906.05586, 2019 | 102 | 2019 |
Learning hierarchical cross-modal association for co-speech gesture generation X Liu, Q Wu, H Zhou, Y Xu, R Qian, X Lin, X Zhou, W Wu, B Dai, B Zhou Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2022 | 88 | 2022 |
Taming diffusion models for audio-driven co-speech gesture generation L Zhu, X Liu, X Liu, R Qian, Z Liu, L Yu Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2023 | 68 | 2023 |
TA2N: Two-stage action alignment network for few-shot action recognition S Li, H Liu, R Qian, Y Li, J See, M Fei, X Yu, W Lin Proceedings of the AAAI Conference on Artificial Intelligence 36 (2), 1404-1411, 2022 | 61 | 2022 |
Motion-aware contrastive video representation learning via foreground-background merging S Ding, M Li, T Yang, R Qian, H Xu, Q Chen, J Wang, H Xiong Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2022 | 50 | 2022 |
Class-aware sounding objects localization via audiovisual correspondence D Hu, Y Wei, R Qian, W Lin, R Song, JR Wen IEEE Transactions on Pattern Analysis and Machine Intelligence 44 (12), 9844 …, 2021 | 35 | 2021 |
Enhancing self-supervised video representation learning via multi-level feature optimization R Qian, Y Li, H Liu, J See, S Ding, X Liu, D Li, W Lin Proceedings of the IEEE/CVF International Conference on Computer Vision …, 2021 | 27 | 2021 |
Visual sound localization in the wild by cross-modal interference erasing X Liu, R Qian, H Zhou, D Hu, W Lin, Z Liu, B Zhou, X Zhou Proceedings of the AAAI Conference on Artificial Intelligence 36 (2), 1801-1809, 2022 | 26 | 2022 |
Static and dynamic concepts for self-supervised video representation learning R Qian, S Ding, X Liu, D Lin European Conference on Computer Vision, 145-164, 2022 | 23 | 2022 |
Dual contrastive learning for spatio-temporal representation S Ding, R Qian, H Xiong Proceedings of the 30th ACM International Conference on Multimedia, 5649-5658, 2022 | 17 | 2022 |
Finding action tubes with a sparse-to-dense framework Y Li, W Lin, T Wang, J See, R Qian, N Xu, L Wang, S Xu Proceedings of the AAAI Conference on Artificial Intelligence 34 (07), 11466 …, 2020 | 15 | 2020 |
Motion-inductive self-supervised object discovery in videos S Ding, W Xie, Y Chen, R Qian, X Zhang, H Xiong, Q Tian arXiv preprint arXiv:2210.00221, 2022 | 11 | 2022 |
Semantics meets temporal correspondence: Self-supervised object-centric learning in videos R Qian, S Ding, X Liu, D Lin Proceedings of the IEEE/CVF International Conference on Computer Vision …, 2023 | 10 | 2023 |
TTAN: Two-stage temporal alignment network for few-shot action recognition S Li, H Liu, R Qian, Y Li, J See, M Fei, X Yu, W Lin arXiv preprint, 2021 | 8 | 2021 |
Prune spatio-temporal tokens by semantic-aware temporal accumulation S Ding, P Zhao, X Zhang, R Qian, H Xiong, Q Tian Proceedings of the IEEE/CVF International Conference on Computer Vision …, 2023 | 6 | 2023 |
SongComposer: A large language model for lyric and melody composition in song generation S Ding, Z Liu, X Dong, P Zhang, R Qian, C He, D Lin, J Wang arXiv preprint arXiv:2402.17645, 2024 | 4 | 2024 |
Streaming long video understanding with large language models R Qian, X Dong, P Zhang, Y Zang, S Ding, D Lin, J Wang arXiv preprint arXiv:2405.16009, 2024 | 3 | 2024 |