VQTTS: High-fidelity text-to-speech synthesis with self-supervised VQ acoustic feature C Du, Y Guo, X Chen, K Yu Proc. Interspeech 2022, 1596-1600, 2022 | 54 | 2022 |
Speaker augmentation for low resource speech recognition C Du, K Yu ICASSP 2020-2020 IEEE International Conference on Acoustics, Speech and …, 2020 | 37 | 2020 |
DAE-Talker: High fidelity speech-driven talking face generation with diffusion autoencoder C Du, Q Chen, T He, X Tan, X Chen, K Yu, S Zhao, J Bian Proceedings of the 31st ACM International Conference on Multimedia, 4281-4289, 2023 | 29 | 2023 |
Data augmentation for end-to-end code-switching speech recognition C Du, H Li, Y Lu, L Wang, Y Qian 2021 IEEE Spoken Language Technology Workshop (SLT), 194-200, 2021 | 26 | 2021 |
Rich prosody diversity modelling with phone-level mixture density network C Du, K Yu Proc. Interspeech 2021, 3136-3140, 2021 | 25* | 2021 |
EmoDiff: Intensity controllable emotional text-to-speech with soft-label guidance Y Guo, C Du, X Chen, K Yu ICASSP 2023-2023 IEEE International Conference on Acoustics, Speech and …, 2023 | 24 | 2023 |
Phone-level prosody modelling with GMM-based MDN for diverse and controllable speech synthesis C Du, K Yu IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 190-201, 2021 | 21* | 2021 |
UniCATS: A unified context-aware text-to-speech framework with contextual VQ-diffusion and vocoding C Du, Y Guo, F Shen, Z Liu, Z Liang, X Chen, S Wang, H Zhang, K Yu Proceedings of the AAAI Conference on Artificial Intelligence 38 (16), 17924 …, 2024 | 20 | 2024 |
Towards data selection on TTS data for children’s speech recognition W Wang, Z Zhou, Y Lu, H Wang, C Du, Y Qian ICASSP 2021-2021 IEEE International Conference on Acoustics, Speech and …, 2021 | 15 | 2021 |
Unsupervised word-level prosody tagging for controllable speech synthesis Y Guo, C Du, K Yu ICASSP 2022-2022 IEEE International Conference on Acoustics, Speech and …, 2022 | 12 | 2022 |
Towards Universal Speech Discrete Tokens: A Case Study for ASR and TTS Y Yang, F Shen, C Du, Z Ma, K Yu, D Povey, X Chen ICASSP 2024-2024 IEEE International Conference on Acoustics, Speech and …, 2024 | 8 | 2024 |
SynAug: Synthesis-based data augmentation for text-dependent speaker verification C Du, B Han, S Wang, Y Qian, K Yu ICASSP 2021-2021 IEEE International Conference on Acoustics, Speech and …, 2021 | 8 | 2021 |
Neural fusion for voice cloning B Chen, C Du, K Yu IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1993-2001, 2022 | 7 | 2022 |
VoiceFlow: Efficient Text-to-Speech with Rectified Flow Matching Y Guo, C Du, Z Ma, X Chen, K Yu ICASSP 2024-2024 IEEE International Conference on Acoustics, Speech and …, 2024 | 6 | 2024 |
VALL-T: Decoder-Only Generative Transducer for Robust and Decoding-Controllable Text-to-Speech C Du, Y Guo, H Wang, Y Yang, Z Niu, S Wang, H Zhang, X Chen, K Yu arXiv preprint arXiv:2401.14321, 2024 | 5 | 2024 |
Speaker Adaptive Text-to-Speech with Timbre-Normalized Vector-Quantized Feature C Du, Y Guo, X Chen, K Yu IEEE/ACM Transactions on Audio, Speech, and Language Processing 31, 3446-3456, 2023 | 5 | 2023 |
Improving Code-Switching and Named Entity Recognition in ASR with Speech Editing based Data Augmentation Z Liang, Z Song, Z Ma, C Du, K Yu, X Chen Proc. Interspeech 2023, 919-923, 2023 | 4 | 2023 |
Acoustic Word Embeddings for End-to-End Speech Synthesis F Shen, C Du, K Yu Applied Sciences 11 (19), 9010, 2021 | 4 | 2021 |
Acoustic BPE for speech generation with discrete tokens F Shen, Y Guo, C Du, X Chen, K Yu ICASSP 2024-2024 IEEE International Conference on Acoustics, Speech and …, 2024 | 3 | 2024 |
DSE-TTS: Dual Speaker Embedding for Cross-Lingual Text-to-Speech S Liu, Y Guo, C Du, X Chen, K Yu Proc. Interspeech 2023, 616-620, 2023 | 3 | 2023 |