Title | Authors | Venue | Cited by | Year |
VLP: A survey on vision-language pre-training | F Chen, D Zhang, M Han, X Chen, J Shi, S Xu, B Xu | arXiv preprint arXiv:2202.09061, 2022 | 169 | 2022 |
X-LLM: Bootstrapping Advanced Large Language Models by Treating Multi-Modalities as Foreign Languages | F Chen, M Han, H Zhao, Q Zhang, J Shi, S Xu, B Xu | arXiv preprint arXiv:2305.04160, 2023 | 73 | 2023 |
Bridging the gap between prior and posterior knowledge selection for knowledge-grounded dialogue generation | X Chen, F Meng, P Li, F Chen, S Xu, B Xu, J Zhou | Proceedings of the 2020 conference on empirical methods in natural language …, 2020 | 71 | 2020 |
DMRM: A dual-channel multi-hop reasoning model for visual dialog | F Chen, F Meng, J Xu, P Li, B Xu, J Zhou | Proceedings of the AAAI Conference on Artificial Intelligence 34 (05), 7504-7511, 2020 | 34 | 2020 |
GoG: Relation-aware graph-over-graph network for visual dialog | F Chen, X Chen, F Meng, P Li, J Zhou | arXiv preprint arXiv:2109.08475, 2021 | 33 | 2021 |
DualGATs: Dual graph attention networks for emotion recognition in conversations | D Zhang, F Chen, X Chen | Proceedings of the 61st Annual Meeting of the Association for Computational …, 2023 | 30 | 2023 |
Improving cross-modal understanding in visual dialog via contrastive learning | F Chen, X Chen, S Xu, B Xu | ICASSP 2022-2022 IEEE International Conference on Acoustics, Speech and …, 2022 | 24 | 2022 |
Multimodal incremental transformer with visual grounding for visual dialogue generation | F Chen, F Meng, X Chen, P Li, J Zhou | arXiv preprint arXiv:2109.08478, 2021 | 19 | 2021 |
Unsupervised knowledge selection for dialogue generation | X Chen, F Chen, F Meng, P Li, J Zhou | Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021 …, 2021 | 17 | 2021 |
Learning to ground visual objects for visual dialog | F Chen, X Chen, C Xu, D Jiang | arXiv preprint arXiv:2109.06013, 2021 | 16 | 2021 |
Unsupervised and pseudo-supervised vision-language alignment in visual dialog | F Chen, D Zhang, X Chen, J Shi, S Xu, B Xu | Proceedings of the 30th ACM International Conference on Multimedia, 4142-4153, 2022 | 12 | 2022 |
Decomposing logits distillation for incremental named entity recognition | D Zhang, Y Yu, F Chen, X Chen | Proceedings of the 46th International ACM SIGIR Conference on Research and …, 2023 | 8 | 2023 |
Knowledge transfer from pre-trained language models to CIF-based speech recognizers via hierarchical distillation | M Han, F Chen, J Shi, S Xu, B Xu | arXiv preprint arXiv:2301.13003, 2023 | 8 | 2023 |
Structure Aware Multi-Graph Network for Multi-Modal Emotion Recognition in Conversations | D Zhang, F Chen, J Chang, X Chen, Q Tian | IEEE Transactions on Multimedia, 2023 | 6 | 2023 |
HiVLP: Hierarchical vision-language pre-training for fast image-text retrieval | F Chen, X Chen, J Shi, D Zhang, J Chang, Q Tian | arXiv preprint arXiv:2205.12105, 2022 | 6 | 2022 |
A multi domain knowledge enhanced matching network for response selection in retrieval-based dialogue systems | X Chen, F Chen, S Xu, B Xu | ICASSP 2022-2022 IEEE International Conference on Acoustics, Speech and …, 2022 | 5 | 2022 |
ViLaS: Exploring the Effects of Vision and Language Context in Automatic Speech Recognition | Z Ni, M Han, F Chen, L Meng, J Shi, P Lv, B Xu | ICASSP 2024-2024 IEEE International Conference on Acoustics, Speech and …, 2024 | 1 | 2024 |
ViLaS: Integrating Vision and Language into Automatic Speech Recognition | M Han, F Chen, Z Ni, L Meng, J Shi, S Xu, B Xu | arXiv e-prints, arXiv:2305.19972, 2023 | 1 | 2023 |
Visual dialog method and apparatus, method and apparatus for training visual dialog model, electronic device, and computer-readable storage medium | F Chen, F Meng, P Li, J Zhou | US Patent App. 17/989,613, 2023 | | 2023 |
Enhancing Visual Question Answering via Deconstructing Questions and Explicating Answers | F Chen, M Han, J Shi, S Xu, B Xu | | | |