InstanceRefer: Cooperative holistic understanding for visual grounding on point clouds through instance multi-level contextual referring | Z Yuan, X Yan, Y Liao, R Zhang, S Wang, Z Li, S Cui | Proceedings of the IEEE/CVF International Conference on Computer Vision …, 2021 | 98 | 2021 |
X-Trans2Cap: Cross-modal knowledge transfer using transformer for 3D dense captioning | Z Yuan, X Yan, Y Liao, Y Guo, G Li, S Cui, Z Li | Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2022 | 62 | 2022 |
Comprehensive visual question answering on point clouds through compositional scene manipulation | X Yan, Z Yuan, Y Du, Y Liao, Y Guo, S Cui, Z Li | IEEE Transactions on Visualization and Computer Graphics, 2023 | 25* | 2023 |
Toward explainable and fine-grained 3D grounding through referring textual phrases | Z Yuan, X Yan, Z Li, X Li, Y Guo, S Cui, Z Li | arXiv preprint arXiv:2207.01821, 2022 | 13 | 2022 |
Revisiting hard example for action recognition | J Wang, J Hu, S Li, Z Yuan | IEEE Transactions on Circuits and Systems for Video Technology 31 (2), 546-556, 2020 | 6 | 2020 |
Visual Programming for Zero-shot Open-Vocabulary 3D Visual Grounding | Z Yuan, J Ren, CM Feng, H Zhao, S Cui, Z Li | arXiv preprint arXiv:2311.15383, 2023 | 4 | 2023 |
Rethinking temporal-related sample for human action recognition | J Wang, S Li, Z Duan, Z Yuan | ICASSP 2020-2020 IEEE International Conference on Acoustics, Speech and …, 2020 | 1 | 2020 |
Instance-free Text to Point Cloud Localization with Relative Position Awareness | L Wang, Z Yuan, J Ren, S Cui, Z Li | arXiv preprint arXiv:2404.17845, 2024 | | 2024 |
GSmoothFace: Generalized Smooth Talking Face Generation via Fine Grained 3D Face Guidance | H Zhang, Z Yuan, C Zheng, X Yan, B Wang, G Li, S Wu, S Cui, Z Li | arXiv preprint arXiv:2312.07385, 2023 | | 2023 |