Diffsinger: Singing voice synthesis via shallow diffusion mechanism | J Liu, C Li, Y Ren, F Chen, Z Zhao | AAAI, 2021 | 226 | 2021 |
Make-an-audio: Text-to-audio generation with prompt-enhanced diffusion models | R Huang, J Huang, D Yang, Y Ren, L Liu, M Li, Z Ye, J Liu, X Yin, Z Zhao | ICML, 2023 | 147 | 2023 |
Prodiff: Progressive fast diffusion model for high-quality text-to-speech | R Huang, Z Zhao, H Liu, J Liu, C Cui, Y Ren | Proceedings of the 30th ACM International Conference on Multimedia, 2595-2605, 2022 | 121 | 2022 |
Audiogpt: Understanding and generating speech, music, sound, and talking head | R Huang, M Li, D Yang, J Shi, X Chang, Z Ye, Y Wu, Z Hong, J Huang, ... | Proceedings of the AAAI Conference on Artificial Intelligence 38 (21), 23802 …, 2024 | 102 | 2024 |
Multi-Singer: Fast Multi-Singer Singing Voice Vocoder With A Large-Scale Corpus | R Huang, F Chen, Y Ren, J Liu, C Cui, Z Zhao | ACM MM, 2021 | 78 | 2021 |
PortaSpeech: Portable and High-Quality Generative Text-to-Speech | Y Ren*, J Liu*, Z Zhao | NeurIPS, 2021 | 70 | 2021 |
GeneFace: Generalized and High-Fidelity Audio-Driven 3D Talking Face Synthesis | Z Ye, Z Jiang, Y Ren, J Liu, JZ He, Z Zhao | ICLR 2023, 2023 | 67 | 2023 |
A study of non-autoregressive model for sequence generation | Y Ren*, J Liu*, X Tan, S Zhao, Z Zhao, TY Liu | ACL, 2020 | 67 | 2020 |
SimulSpeech: End-to-end simultaneous speech to text translation | Y Ren*, J Liu*, X Tan, C Zhang, T Qin, Z Zhao, TY Liu | ACL, 2020 | 64 | 2020 |
Generspeech: Towards style transfer for generalizable out-of-domain text-to-speech | R Huang, Y Ren, J Liu, C Cui, Z Zhao | NeurIPS, 2022 | 63 | 2022 |
Singgan: Generative adversarial network for high-fidelity singing voice generation | R Huang, C Cui, F Chen, Y Ren, J Liu, Z Zhao, B Huai, Z Wang | Proceedings of the 30th ACM International Conference on Multimedia, 2525-2535, 2022 | 51 | 2022 |
M4singer: A multi-style, multi-singer and musical score provided mandarin singing corpus | L Zhang, R Li, S Wang, L Deng, J Liu, Y Ren, J He, R Huang, J Zhu, ... | Advances in Neural Information Processing Systems 35, 6914-6926, 2022 | 49 | 2022 |
Denoispeech: Denoising text to speech with frame-level noise modeling | C Zhang, Y Ren, X Tan, J Liu, K Zhang, T Qin, S Zhao, TY Liu | ICASSP, 2021 | 44 | 2021 |
TranSpeech: Speech-to-Speech Translation With Bilateral Perturbation | R Huang*, J Liu*, H Liu*, Y Ren, L Zhang, J He, Z Zhao | ICLR, 2022 | 37 | 2022 |
Task-Level Curriculum Learning for Non-Autoregressive Neural Machine Translation | J Liu, Y Ren, X Tan, C Zhang, T Qin, Z Zhao, TY Liu | IJCAI, 2020 | 36 | 2020 |
Mega-tts: Zero-shot text-to-speech at scale with intrinsic inductive bias | Z Jiang, Y Ren, Z Ye, J Liu, C Zhang, Q Yang, S Ji, R Huang, C Wang, ... | arXiv preprint arXiv:2306.03509, 2023 | 31 | 2023 |
EMOVIE: A Mandarin Emotion Speech Dataset with a Simple Emotional Text-to-Speech Model | C Cui, Y Ren, J Liu, F Chen, R Huang, M Lei, Z Zhao | INTERSPEECH 2021, 2021 | 26 | 2021 |
SimulSLT: End-to-End Simultaneous Sign Language Translation | A Yin, Z Zhao, J Liu, W Jin, M Zhang, X Zeng, X He | ACM MM 2021, 2021 | 25 | 2021 |
Mega-TTS 2: Boosting Prompting Mechanisms for Zero-Shot Speech Synthesis | Z Jiang, J Liu*, Y Ren, J He, Z Ye, S Ji, Q Yang, C Zhang, P Wei, C Wang, ... | The Twelfth International Conference on Learning Representations, 2023 | 19* | 2023 |
Make-an-audio 2: Temporal-enhanced text-to-audio generation | J Huang, Y Ren, R Huang, D Yang, Z Ye, C Zhang, J Liu, X Yin, Z Ma, ... | arXiv preprint arXiv:2305.18474, 2023 | 16 | 2023 |