Speaker adaptation using spectro-temporal deep features for dysarthric and elderly speech recognition M Geng, X Xie, Z Ye, T Wang, G Li, S Hu, X Liu, H Meng IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 2597-2611, 2022 | 27 | 2022 |
Exploring self-supervised pre-trained ASR models for dysarthric and elderly speech recognition S Hu, X Xie, Z Jin, M Geng, Y Wang, M Cui, J Deng, X Liu, H Meng ICASSP 2023-2023 IEEE International Conference on Acoustics, Speech and …, 2023 | 17 | 2023 |
Exploiting cross domain acoustic-to-articulatory inverted features for disordered speech recognition S Hu, S Liu, X Xie, M Geng, T Wang, S Hu, M Cui, X Liu, H Meng ICASSP 2022-2022 IEEE International Conference on Acoustics, Speech and …, 2022 | 15 | 2022 |
Personalized adversarial data augmentation for dysarthric and elderly speech recognition Z Jin, M Geng, J Deng, T Wang, S Hu, G Li, X Liu IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2023 | 14 | 2023 |
Adversarial data augmentation using VAE-GAN for disordered speech recognition Z Jin, X Xie, M Geng, T Wang, S Hu, J Deng, G Li, X Liu ICASSP 2023-2023 IEEE International Conference on Acoustics, Speech and …, 2023 | 14 | 2023 |
WavLLM: Towards robust and adaptive speech large language model S Hu, L Zhou, S Liu, S Chen, H Hao, J Pan, X Liu, J Li, S Sivasankaran, ... arXiv preprint arXiv:2404.00656, 2024 | 9 | 2024 |
Boosting large language model for speech synthesis: An empirical study H Hao, L Zhou, S Liu, J Li, S Hu, R Wang, F Wei arXiv preprint arXiv:2401.00246, 2023 | 9 | 2023 |
Two-pass decoding and cross-adaptation based system combination of end-to-end Conformer and hybrid TDNN ASR systems M Cui, J Deng, S Hu, X Xie, T Wang, S Hu, M Geng, B Xue, X Liu, H Meng arXiv preprint arXiv:2206.11596, 2022 | 9 | 2022 |
Confidence score based speaker adaptation of conformer speech recognition systems J Deng, X Xie, T Wang, M Cui, B Xue, Z Jin, G Li, S Hu, X Liu IEEE/ACM Transactions on Audio, Speech, and Language Processing 31, 1175-1190, 2023 | 8 | 2023 |
Audio-visual end-to-end multi-channel speech separation, dereverberation and recognition G Li, J Deng, M Geng, Z Jin, T Wang, S Hu, M Cui, H Meng, X Liu IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2023 | 7 | 2023 |
Exploiting cross-domain and cross-lingual ultrasound tongue imaging features for elderly and dysarthric speech recognition S Hu, X Xie, M Geng, M Cui, J Deng, G Li, T Wang, X Liu, H Meng arXiv preprint arXiv:2206.07327, 2022 | 7 | 2022 |
Enhancing Pre-trained ASR System Fine-tuning for Dysarthric Speech Recognition using Adversarial Data Augmentation H Wang, Z Jin, M Geng, S Hu, G Li, T Wang, H Xu, X Liu ICASSP 2024-2024 IEEE International Conference on Acoustics, Speech and …, 2024 | 6 | 2024 |
Use of speech impairment severity for dysarthric speech recognition M Geng, Z Jin, T Wang, S Hu, J Deng, M Cui, G Li, J Yu, X Xie, X Liu arXiv preprint arXiv:2305.10659, 2023 | 5 | 2023 |
On-the-Fly Feature Based Rapid Speaker Adaptation for Dysarthric and Elderly Speech Recognition M Geng, X Xie, R Su, J Yu, Z Jin, T Wang, S Hu, Z Ye, H Meng, X Liu arXiv preprint arXiv:2203.14593, 2022 | 3 | 2022 |
Autoregressive Speech Synthesis without Vector Quantization L Meng, L Zhou, S Liu, S Chen, B Han, S Hu, Y Liu, J Li, S Zhao, X Wu, ... arXiv preprint arXiv:2407.08551, 2024 | | 2024 |
Homogeneous Speaker Features for On-the-Fly Dysarthric and Elderly Speaker Adaptation M Geng, X Xie, J Deng, Z Jin, G Li, T Wang, S Hu, Z Li, H Meng, X Liu arXiv preprint arXiv:2407.06310, 2024 | | 2024 |
Self-supervised ASR Models and Features For Dysarthric and Elderly Speech Recognition S Hu, X Xie, M Geng, Z Jin, J Deng, G Li, Y Wang, M Cui, T Wang, H Meng, ... IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2024 | | 2024 |
One-pass Multiple Conformer and Foundation Speech Systems Compression and Quantization Using An All-in-one Neural Model Z Li, H Xu, T Wang, S Hu, Z Jin, S Hu, J Deng, M Cui, M Geng, X Liu arXiv preprint arXiv:2406.10160, 2024 | | 2024 |
Joint Speaker Features Learning for Audio-visual Multichannel Speech Separation and Recognition G Li, J Deng, Y Chen, M Geng, S Hu, Z Li, Z Jin, T Wang, X Xie, H Meng, ... arXiv preprint arXiv:2406.10152, 2024 | | 2024 |
Towards Effective and Efficient Non-autoregressive Decoding Using Block-based Attention Mask T Wang, X Xie, Z Li, S Hu, Z Jing, J Deng, M Cui, S Hu, M Geng, G Li, ... arXiv preprint arXiv:2406.10034, 2024 | | 2024 |