Make-An-Audio: Text-to-audio generation with prompt-enhanced diffusion models R Huang, J Huang, D Yang, Y Ren, L Liu, M Li, Z Ye, J Liu, X Yin, Z Zhao International Conference on Machine Learning, 13916-13932, 2023 | 216 | 2023 |
AudioGPT: Understanding and generating speech, music, sound, and talking head R Huang, M Li, D Yang, J Shi, X Chang, Z Ye, Y Wu, Z Hong, J Huang, ... Proceedings of the AAAI Conference on Artificial Intelligence 38 (21), 23802 …, 2024 | 130 | 2024 |
Make-An-Audio 2: Temporal-enhanced text-to-audio generation J Huang, Y Ren, R Huang, D Yang, Z Ye, C Zhang, J Liu, X Yin, Z Ma, ... arXiv preprint arXiv:2305.18474, 2023 | 32 | 2023 |
GeneFace++: Generalized and stable real-time audio-driven 3D talking face generation Z Ye, J He, Z Jiang, R Huang, J Huang, J Liu, Y Ren, X Yin, Z Ma, Z Zhao arXiv preprint arXiv:2305.00787, 2023 | 18 | 2023 |
Frieren: Efficient Video-to-Audio Generation with Rectified Flow Matching Y Wang, W Guo, R Huang, J Huang, Z Wang, F You, R Li, Z Zhao arXiv preprint arXiv:2406.00320, 2024 | 3 | 2024 |
MulliVC: Multi-lingual Voice Conversion With Cycle Consistency J Huang, C Zhang, Y Ren, Z Jiang, Z Ye, J Liu, J He, X Yin, Z Zhao arXiv preprint arXiv:2408.04708, 2024 | | 2024 |