SUPERB: Speech processing universal performance benchmark S Yang, PH Chi, YS Chuang, CIJ Lai, K Lakhotia, YY Lin, AT Liu, J Shi, ... arXiv preprint arXiv:2105.01051, 2021 | 719 | 2021 |
CHiME-6 challenge: Tackling multispeaker speech recognition for unsegmented recordings S Watanabe, M Mandel, J Barker, E Vincent, A Arora, X Chang, ... arXiv preprint arXiv:2004.09249, 2020 | 298 | 2020 |
Recent developments on ESPnet toolkit boosted by Conformer P Guo, F Boyer, X Chang, T Hayashi, Y Higuchi, H Inaguma, N Kamo, C Li, ... ICASSP 2021-2021 IEEE International Conference on Acoustics, Speech and …, 2021 | 271 | 2021 |
MIMO-Speech: End-to-end multi-channel multi-speaker speech recognition X Chang, W Zhang, Y Qian, J Le Roux, S Watanabe 2019 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU …, 2019 | 115 | 2019 |
AudioGPT: Understanding and generating speech, music, sound, and talking head R Huang, M Li, D Yang, J Shi, X Chang, Z Ye, Y Wu, Z Hong, J Huang, ... Proceedings of the AAAI Conference on Artificial Intelligence 38 (21), 23802 …, 2024 | 104 | 2024 |
End-to-end multi-speaker speech recognition with transformer X Chang, W Zhang, Y Qian, J Le Roux, S Watanabe ICASSP 2020-2020 IEEE International Conference on Acoustics, Speech and …, 2020 | 101 | 2020 |
Recognizing multi-talker speech with permutation invariant training D Yu, X Chang, Y Qian arXiv preprint arXiv:1704.01985, 2017 | 100 | 2017 |
Past review, current progress, and challenges ahead on the cocktail party problem Y Qian, C Weng, X Chang, S Wang, D Yu Frontiers of Information Technology & Electronic Engineering 19, 40-63, 2018 | 92 | 2018 |
Single-channel multi-talker speech recognition with permutation invariant training Y Qian, X Chang, D Yu Speech Communication 104, 1-11, 2018 | 82 | 2018 |
End-to-end monaural multi-speaker ASR system without pretraining X Chang, Y Qian, K Yu, S Watanabe ICASSP 2019-2019 IEEE International Conference on Acoustics, Speech and …, 2019 | 81 | 2019 |
SUPERB-SG: Enhanced speech processing universal PERformance benchmark for semantic and generative capabilities HS Tsai, HJ Chang, WC Huang, Z Huang, K Lakhotia, S Yang, S Dong, ... arXiv preprint arXiv:2203.06849, 2022 | 78 | 2022 |
An exploration of self-supervised pretrained representations for end-to-end speech recognition X Chang, T Maekaku, P Guo, J Shi, YJ Lu, AS Subramanian, T Wang, ... 2021 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU …, 2021 | 74 | 2021 |
ESPnet-SE: End-to-end speech enhancement and separation toolkit designed for ASR integration C Li, J Shi, W Zhang, AS Subramanian, X Chang, N Kamo, M Hira, ... 2021 IEEE Spoken Language Technology Workshop (SLT), 785-792, 2021 | 72 | 2021 |
ESPnet-SLU: Advancing spoken language understanding through ESPnet S Arora, S Dalmia, P Denisov, X Chang, Y Ueda, Y Peng, Y Zhang, ... ICASSP 2022-2022 IEEE International Conference on Acoustics, Speech and …, 2022 | 67 | 2022 |
The 2020 ESPnet update: new features, broadened applications, performance improvements, and future plans S Watanabe, F Boyer, X Chang, P Guo, T Hayashi, Y Higuchi, T Hori, ... 2021 IEEE Data Science and Learning Workshop (DSLW), 1-6, 2021 | 53 | 2021 |
Unrestricted Vocabulary Keyword Spotting Using LSTM-CTC. Y Zhuang, X Chang, Y Qian, K Yu Interspeech, 938-942, 2016 | 50 | 2016 |
End-to-end integration of speech recognition, speech enhancement, and self-supervised learning representation X Chang, T Maekaku, Y Fujita, S Watanabe arXiv preprint arXiv:2204.00540, 2022 | 45 | 2022 |
Investigation of end-to-end speaker-attributed ASR for continuous multi-talker recordings N Kanda, X Chang, Y Gaur, X Wang, Z Meng, Z Chen, T Yoshioka 2021 IEEE Spoken Language Technology Workshop (SLT), 809-816, 2021 | 42 | 2021 |
UniAudio: An audio foundation model toward universal audio generation D Yang, J Tian, X Tan, R Huang, S Liu, X Chang, J Shi, S Zhao, J Bian, ... arXiv preprint arXiv:2310.00704, 2023 | 38 | 2023 |
Insertion-based modeling for end-to-end automatic speech recognition Y Fujita, S Watanabe, M Omachi, X Chang arXiv preprint arXiv:2005.13211, 2020 | 37 | 2020 |