Make-An-Audio: Text-to-audio generation with prompt-enhanced diffusion models R Huang, J Huang, D Yang, Y Ren, L Liu, M Li, Z Ye, J Liu, X Yin, Z Zhao ICML 2023, 2023 | 185 | 2023 |
AudioGPT: Understanding and generating speech, music, sound, and talking head. R Huang, M Li, D Yang, J Shi, X Chang, Z Ye, Y Wu, Z Hong, J Huang, ... AAAI 2023, 2023 | 128 | 2023 |
GeneFace: Generalized and High-Fidelity Audio-Driven 3D Talking Face Synthesis Z Ye, Z Jiang, Y Ren, J Liu, JZ He, Z Zhao ICLR 2023, 2023 | 78 | 2023 |
Multi-agent deep reinforcement learning for voltage control with coordinated active and reactive power optimization D Hu, Z Ye, Y Gao, Z Ye, Y Peng, N Yu IEEE Transactions on Smart Grid 13 (6), 4873-4886, 2022 | 64 | 2022 |
Multi-UAV navigation for partially observable communication coverage by graph reinforcement learning Z Ye, K Wang, Y Chen, X Jiang, G Song IEEE Transactions on Mobile Computing 22 (7), 4056-4069, 2022 | 62 | 2022 |
Mega-TTS: Zero-shot text-to-speech at scale with intrinsic inductive bias Z Jiang, Y Ren, Z Ye, J Liu, C Zhang, Q Yang, S Ji, R Huang, C Wang, ... arXiv preprint arXiv:2306.03509, 2023 | 39 | 2023 |
Make-An-Audio 2: Temporal-enhanced text-to-audio generation J Huang, Y Ren, R Huang, D Yang, Z Ye, C Zhang, J Liu, X Yin, Z Ma, ... arXiv preprint arXiv:2305.18474, 2023 | 26 | 2023 |
Make-A-Voice: Unified voice synthesis with discrete representation R Huang, C Zhang, Y Wang, D Yang, L Liu, Z Ye, Z Jiang, C Weng, ... arXiv preprint arXiv:2305.19269, 2023 | 22 | 2023 |
SyntaSpeech: Syntax-Aware Generative Adversarial Text-to-Speech Z Ye, Z Zhao, Y Ren, F Wu IJCAI 2022, 2022 | 22 | 2022 |
Improving Sample Efficiency in Multi-Agent Actor-Critic Methods Z Ye, Y Chen, X Jiang, G Song, B Yang, S Fan Applied Intelligence, 1-14, 2022 | 20 | 2022 |
Mega-TTS 2: Zero-shot text-to-speech with arbitrary length speech prompts Z Jiang, J Liu, Y Ren, J He, C Zhang, Z Ye, P Wei, C Wang, X Yin, Z Ma, ... arXiv preprint arXiv:2307.07218, 2023 | 17 | 2023 |
GeneFace++: Generalized and stable real-time audio-driven 3D talking face generation Z Ye, J He, Z Jiang, R Huang, J Huang, J Liu, Y Ren, X Yin, Z Ma, Z Zhao arXiv preprint arXiv:2305.00787, 2023 | 16 | 2023 |
CLAPSpeech: Learning Prosody from Text Context with Contrastive Language-Audio Pre-training Z Ye, R Huang, Y Ren, Z Jiang, J Liu, J He, X Yin, Z Zhao ACL 2023 (Main Conference), 2023 | 14 | 2023 |
Real3D-Portrait: One-shot Realistic 3D Talking Portrait Synthesis Z Ye, T Zhong, Y Ren, J Yang, W Li, J Huang, Z Jiang, J He, R Huang, ... ICLR 2024 (Spotlight), 2024 | 12 | 2024 |
AV-TranSpeech: Audio-Visual Robust Speech-to-Speech Translation R Huang, H Liu, X Cheng, Y Ren, L Li, Z Ye, J He, L Zhang, J Liu, X Yin, ... ACL 2023 (Main Conference), 2023 | 10 | 2023 |
RMSSinger: Realistic-Music-Score based Singing Voice Synthesis J He, J Liu, Z Ye, R Huang, C Cui, H Liu, Z Zhao ACL 2023 (Findings), 2023 | 9 | 2023 |
Scalable and transferable reinforcement learning for multi-agent mixed cooperative–competitive environments based on hierarchical graph attention Y Chen, G Song, Z Ye, X Jiang Entropy 24 (4), 563, 2022 | 9 | 2022 |
FluentSpeech: Stutter-Oriented Automatic Speech Editing with Context-Aware Diffusion Models Z Jiang, Q Yang, J Zuo, Z Ye, R Huang, Y Ren, Z Zhao ACL 2023 (Findings), 2023 | 6 | 2023 |
Space-air-ground integrated mobile crowdsensing for partially observable data collection by multi-scale convolutional graph reinforcement learning Y Ren, Z Ye, G Song, X Jiang Entropy 24 (5), 638, 2022 | 6 | 2022 |
Experience augmentation: Boosting and accelerating off-policy multi-agent reinforcement learning Z Ye, Y Chen, G Song, B Yang, S Fan arXiv preprint arXiv:2005.09453, 2020 | 6 | 2020 |