| Title | Authors | Venue | Cited by | Year |
|---|---|---|---|---|
| Multi-Grained Vision Language Pre-Training: Aligning Texts with Visual Concepts | Y Zeng, X Zhang, H Li | arXiv preprint arXiv:2111.08276 | 230 | 2021 |
| What Matters in Training a GPT4-Style Language Model with Multimodal Inputs? | Y Zeng, H Zhang, J Zheng, J Xia, G Wei, Y Wei, Y Zhang, T Kong | arXiv preprint arXiv:2307.02469 | 42 | 2023 |
| X²-VLM: All-In-One Pre-trained Model For Vision-Language Tasks | Y Zeng, X Zhang, H Li, J Wang, J Zhang, W Zhou | arXiv preprint arXiv:2211.12402 | 37 | 2022 |
| Make Pixels Dance: High-Dynamic Video Generation | Y Zeng, G Wei, J Zheng, J Zou, Y Wei, Y Zhang, H Li | arXiv preprint arXiv:2311.10982 | 28 | 2023 |
| Jointly Optimizing State Operation Prediction and Value Generation for Dialogue State Tracking | Y Zeng, JY Nie | arXiv preprint arXiv:2010.14061 | 21* | 2020 |
| Cross-View Language Modeling: Towards Unified Cross-Lingual Cross-Modal Pre-training | Y Zeng, W Zhou, A Luo, X Zhang | arXiv preprint arXiv:2206.00621 | 20 | 2022 |
| A Simple and Efficient Multi-Task Learning Approach for Conditioned Dialogue Generation | Y Zeng, JY Nie | Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics (NAACL) | 15* | 2021 |
| VLUE: A Multi-Task Benchmark for Evaluating Vision-Language Models | W Zhou, Y Zeng, S Diao, X Zhang | arXiv preprint arXiv:2205.15237 | 12 | 2022 |
| Multi-Domain Dialogue State Tracking Based on State Graph | Y Zeng, JY Nie | arXiv preprint arXiv:2010.11137 | 11 | 2020 |
| An Investigation of Suitability of Pre-Trained Language Models for Dialogue Generation – Avoiding Discrepancies | Y Zeng, JY Nie | Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021 | 8* | 2021 |