A novel learnable dictionary encoding layer for end-to-end language identification W Cai, Z Cai, X Zhang, X Wang, M Li 2018 IEEE international conference on acoustics, speech and signal …, 2018 | 83 | 2018 |
From speaker verification to multispeaker speech synthesis, deep transfer with feedback constraint Z Cai, C Zhang, M Li Proc. Interspeech 2020, 3974--3978, 2020 | 45 | 2020 |
Insights into end-to-end learning scheme for language identification W Cai, Z Cai, W Liu, X Wang, M Li 2018 IEEE international conference on acoustics, speech and signal …, 2018 | 39 | 2018 |
Polyphone disambiguation for Mandarin Chinese using conditional neural network with multi-level embedding features Z Cai, Y Yang, C Zhang, X Qin, M Li Proc. Interspeech 2019, 2110--2114, 2019 | 30 | 2019 |
End-to-end language identification using NetFV and NetVLAD J Chen, W Cai, D Cai, Z Cai, H Zhong, M Li 2018 11th International Symposium on Chinese Spoken Language Processing …, 2018 | 15 | 2018 |
Waveform boundary detection for partially spoofed audio Z Cai, W Wang, M Li ICASSP 2023-2023 IEEE International Conference on Acoustics, Speech and …, 2023 | 14 | 2023 |
The DKU-JNU-EMA electromagnetic articulography database on Mandarin and Chinese dialects with tandem feature based acoustic-to-articulatory inversion Z Cai, X Qin, D Cai, M Li, X Liu, H Zhong 2018 11th International Symposium on Chinese Spoken Language Processing …, 2018 | 12 | 2018 |
Cross-lingual multispeaker text-to-speech under limited-data scenario Z Cai, Y Yang, M Li arXiv preprint arXiv:2005.10441, 2020 | 11 | 2020 |
Cross-lingual multi-speaker speech synthesis with limited bilingual training data Z Cai, Y Yang, M Li Computer Speech & Language 77, 101427, 2023 | 10 | 2023 |
SIG-VC: A speaker information guided zero-shot voice conversion system for both human beings and machines H Zhang, Z Cai, X Qin, M Li ICASSP 2022-2022 IEEE International Conference on Acoustics, Speech and …, 2022 | 10 | 2022 |
Identifying source speakers for voice conversion based spoofing attacks on speaker verification systems D Cai, Z Cai, M Li ICASSP 2023-2023 IEEE International Conference on Acoustics, Speech and …, 2023 | 7 | 2023 |
Deep speaker embeddings with convolutional neural network on supervector for text-independent speaker recognition D Cai, Z Cai, M Li 2018 Asia-Pacific Signal and Information Processing Association Annual …, 2018 | 7 | 2018 |
Electrolaryngeal speech enhancement based on a two stage framework with bottleneck feature refinement and voice conversion Y Yang, H Zhang, Z Cai, Y Shi, M Li, D Zhang, X Ding, J Deng, J Wang Biomedical Signal Processing and Control 80, 104279, 2023 | 5 | 2023 |
Integrating frame-level boundary detection and deepfake detection for locating manipulated regions in partially spoofed audio forgery attacks Z Cai, M Li Computer Speech & Language 85, 101597, 2024 | 4 | 2024 |
Unsupervised query by example spoken term detection using features concatenated with self-organizing map distances H Wu, M Li, Z Cai, H Zhong 2018 11th International Symposium on Chinese Spoken Language Processing …, 2018 | 4 | 2018 |
The DKU-DUKEECE System for the Manipulation Region Location Task of ADD 2023 Z Cai, W Wang, Y Wang, M Li arXiv preprint arXiv:2308.10281, 2023 | 2 | 2023 |
F0 Contour Estimation Using Phonetic Feature in Electrolaryngeal Speech Enhancement Z Cai, Z Xu, M Li ICASSP 2019-2019 IEEE International Conference on Acoustics, Speech and …, 2019 | 2 | 2019 |
Invertible Voice Conversion Z Cai, M Li arXiv preprint arXiv:2201.10687, 2022 | 1 | 2022 |
The DKU Speech Synthesis System for 2019 Blizzard Challenge Z Cai, C Zhang, Y Yang, M Li Blizzard Challenge Workshop, 2019 | 1 | 2019 |
Invertible Voice Conversion with Parallel Data Z Cai, M Li ICASSP 2024-2024 IEEE International Conference on Acoustics, Speech and …, 2024 | | 2024 |