SpeechGPT: Empowering Large Language Models with Intrinsic Cross-Modal Conversational Abilities D Zhang, S Li, X Zhang, J Zhan, P Wang, Y Zhou, X Qiu EMNLP 2023 (Findings), 2023 | 126 | 2023 |
SpeechTokenizer: Unified Speech Tokenizer for Speech Large Language Models X Zhang*, D Zhang*, S Li, Y Zhou, X Qiu ICLR 2024, 2023 | 40* | 2023 |
AnyGPT: Unified Multimodal LLM with Discrete Sequence Modeling J Zhan, J Dai, J Ye, Y Zhou, D Zhang, Z Liu, X Zhang, R Yuan, G Zhang, ... ACL 2024, 2024 | 26 | 2024 |
SeqXGPT: Sentence-Level AI-Generated Text Detection P Wang, L Li, K Ren, B Jiang, D Zhang, X Qiu EMNLP 2023, 2023 | 21 | 2023 |
DUB: Discrete Unit Back-translation for Speech Translation D Zhang, R Ye, T Ko, M Wang, Y Zhou ACL 2023 (Findings), 2023 | 17 | 2023 |
InferAligner: Inference-Time Alignment for Harmlessness through Cross-Model Guidance P Wang, D Zhang, L Li, C Tan, X Wang, K Ren, B Jiang, X Qiu arXiv preprint arXiv:2401.11206, 2024 | 11 | 2024 |
GroundingGPT: Language Enhanced Multi-modal Grounding Model Z Li, Q Xu, D Zhang, H Song, Y Cai, Q Qi, R Zhou, J Pan, Z Li, VT Vu, ... ACL 2024, 2024 | 8* | 2024 |
SpeechAlign: Aligning Speech Generation to Human Preferences D Zhang, Z Li, S Li, X Zhang, P Wang, Y Zhou, X Qiu arXiv preprint arXiv:2404.05600, 2024 | 4 | 2024 |
GenTranslate: Large Language Models are Generative Multilingual Speech and Machine Translators Y Hu, C Chen, CHH Yang, R Li, D Zhang, Z Chen, ES Chng ACL 2024, 2024 | 2 | 2024 |
SpeechGPT-Gen: Scaling Chain-of-Information Speech Generation D Zhang, X Zhang, J Zhan, S Li, Y Zhou, X Qiu arXiv preprint arXiv:2401.13527, 2024 | 2 | 2024 |
SpeechAgents: Human-Communication Simulation with Multi-Modal Multi-Agent Systems D Zhang, Z Li, P Wang, X Zhang, Y Zhou, X Qiu arXiv preprint arXiv:2401.03945, 2024 | 1 | 2024 |