SpeechLM: Enhanced speech pre-training with unpaired textual data Z Zhang, S Chen, L Zhou, Y Wu, S Ren, S Liu, Z Yao, X Gong, L Dai, J Li, ... IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2024 | 42 | 2024 |
Layer-Wise Fast Adaptation for End-to-End Multi-Accent Speech Recognition Y Qian, X Gong, H Huang IEEE/ACM Transactions on Audio, Speech, and Language Processing 30 (DOI: 10 …, 2022 | 30 | 2022 |
Layer-Wise Fast Adaptation for End-to-End Multi-Accent Speech Recognition X Gong, Y Lu, Z Zhou, Y Qian Proc. Interspeech 2021, 1274-1278, 2021 | 26 | 2021 |
Text adaptation for speaker verification with speaker-text factorized embeddings Y Yang, S Wang, X Gong, Y Qian, K Yu ICASSP 2020-2020 IEEE International Conference on Acoustics, Speech and …, 2020 | 12 | 2020 |
Whisper-KDQ: A lightweight Whisper via guided knowledge distillation and quantization for efficient ASR H Shao, W Wang, B Liu, X Gong, H Wang, Y Qian arXiv preprint arXiv:2305.10788, 2023 | 11 | 2023 |
Factorized AED: Factorized attention-based encoder-decoder for text-only domain adaptive ASR X Gong, W Wang, H Shao, X Chen, Y Qian ICASSP 2023-2023 IEEE International Conference on Acoustics, Speech and …, 2023 | 8 | 2023 |
LongFNT: Long-form speech recognition with factorized neural transducer X Gong, Y Wu, J Li, S Liu, R Zhao, X Chen, Y Qian ICASSP 2023-2023 IEEE International Conference on Acoustics, Speech and …, 2023 | 8 | 2023 |
Knowledge Transfer and Distillation from Autoregressive to Non-Autoregressive Speech Recognition X Gong, Z Zhou, Y Qian Proc. Interspeech 2022, 2618-2622, 2022 | 7 | 2022 |
Advanced long-content speech recognition with factorized neural transducer X Gong, Y Wu, J Li, S Liu, R Zhao, X Chen, Y Qian IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2024 | 4 | 2024 |
Speaker embedding augmentation with noise distribution matching X Gong, Z Chen, Y Yang, S Wang, L Wang, Y Qian 2021 12th International Symposium on Chinese Spoken Language Processing …, 2021 | 3 | 2021 |
End-to-End Texture-Aware and Depth-Aware Embedded Advertising for Videos J Li, X Gong, B Li Proceedings of the 2020 6th International Conference on Computer and …, 2020 | 1 | 2020 |
Contextual Biasing Speech Recognition in Speech-enhanced Large Language Model X Gong, A Lv, Z Wang, Y Qian Interspeech 2024, 257-261, 2024 | | 2024 |
Joint Discriminator and Transfer Based Fast Domain Adaptation For End-To-End Speech Recognition H Shao, T Tan, W Wang, X Gong, Y Qian ICASSP 2023-2023 IEEE International Conference on Acoustics, Speech and …, 2023 | | 2023 |