Cross-speaker emotion transfer through information perturbation in emotional speech synthesis Y Lei, S Yang, X Zhu, L Xie, D Su IEEE Signal Processing Letters 29, 1948-1952, 2022 | 11 | 2022 |
Multi-speaker expressive speech synthesis via multiple factors decoupling X Zhu, Y Lei, K Song, Y Zhang, T Li, L Xie ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and …, 2023 | 9 | 2023 |
SELM: Speech enhancement using discrete tokens and language models Z Wang, X Zhu, Z Zhang, YJ Lv, N Jiang, G Zhao, L Xie ICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and …, 2024 | 6 | 2024 |
DiCLET-TTS: Diffusion model based cross-lingual emotion transfer for text-to-speech—A study between English and Mandarin T Li, C Hu, J Cong, X Zhu, J Li, Q Tian, Y Wang, L Xie IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2023 | 6 | 2023 |
Vec-Tok Speech: Speech vectorization and tokenization for neural speech generation X Zhu, Y Lv, Y Lei, T Li, W He, H Zhou, H Lu, L Xie arXiv preprint arXiv:2310.07246, 2023 | 5 | 2023 |
METTS: Multilingual emotional text-to-speech by cross-speaker and cross-lingual emotion transfer X Zhu, Y Lei, T Li, Y Zhang, H Zhou, H Lu, L Xie IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2024 | 4 | 2024 |
SponTTS: modeling and transferring spontaneous style for TTS H Li, X Zhu, L Xue, Y Song, Y Chen, L Xie ICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and …, 2024 | 1 | 2024 |
HIGNN-TTS: Hierarchical Prosody Modeling With Graph Neural Networks for Expressive Long-Form TTS D Guo, X Zhu, L Xue, T Li, Y Lv, Y Jiang, L Xie 2023 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU), 1-7, 2023 | 1 | 2023 |
Zero-Shot Emotion Transfer for Cross-Lingual Speech Synthesis Y Li, X Zhu, Y Lei, H Li, J Liu, D Xie, L Xie 2023 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU), 1-8, 2023 | 1 | 2023 |
Accent-VITS: accent transfer for end-to-end TTS L Ma, Y Zhang, X Zhu, Y Lei, Z Ning, P Zhu, L Xie National Conference on Man-Machine Speech Communication, 203-214, 2023 | 1 | 2023 |
Multi-Speaker Expressive Speech Synthesis via Semi-supervised Contrastive Learning X Zhu, Y Li, Y Lei, N Jiang, G Zhao, L Xie arXiv preprint arXiv:2310.17101, 2023 | 1 | 2023 |
Vec-Tok-VC+: Residual-enhanced Robust Zero-shot Voice Conversion with Progressive Constraints in a Dual-mode Training Strategy L Ma, X Zhu, Y Lv, Z Wang, Z Wang, W He, H Zhou, L Xie arXiv preprint arXiv:2406.09844, 2024 | | 2024 |
Single-Codec: Single-Codebook Speech Codec towards High-Performance Speech Generation H Li, L Xue, H Guo, X Zhu, Y Lv, L Xie, Y Chen, H Yin, Z Li arXiv preprint arXiv:2406.07422, 2024 | | 2024 |
Text-aware and Context-aware Expressive Audiobook Speech Synthesis D Guo, X Zhu, L Xue, Y Zhang, W Tian, L Xie arXiv preprint arXiv:2406.05672, 2024 | | 2024 |
U-Style: Cascading U-nets with Multi-level Speaker and Style Modeling for Zero-Shot Voice Cloning T Li, Z Wang, X Zhu, J Cong, Q Tian, Y Wang, L Xie arXiv preprint arXiv:2310.04004, 2023 | | 2023 |
The NPU-ASLP System for The ISCSLP 2022 Magichub Code-Swiching ASR Challenge Y Liang, P Chen, F Yu, X Zhu, T Xu, Y Gao, L Xie 2022 13th International Symposium on Chinese Spoken Language Processing …, 2022 | | 2022 |
Contrastive Context-Speech Pretraining for Expressive Text-to-Speech Synthesis Y Xiao, X Wang, X Tan, L He, X Zhu, T Lee ACM Multimedia 2024 | | 2024 |
UniStyle: Unified Style Modeling for Speaking Style Captioning and Stylistic Speech Synthesis X Zhu, W Tian, X Wang, L He, Y Xiao, X Wang, X Tan, L Xie ACM Multimedia 2024 | | 2024 |