A comparative study on Transformer vs RNN in speech applications S Karita, N Chen, T Hayashi, T Hori, H Inaguma, Z Jiang, M Someki, ... 2019 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU …, 2019 | 785 | 2019 |
Serialized output training for end-to-end overlapped speech recognition N Kanda, Y Gaur, X Wang, Z Meng, T Yoshioka arXiv preprint arXiv:2003.12687, 2020 | 96 | 2020 |
Joint speaker counting, speech recognition, and speaker identification for overlapped speech of any number of speakers N Kanda, Y Gaur, X Wang, Z Meng, Z Chen, T Zhou, T Yoshioka arXiv preprint arXiv:2006.10930, 2020 | 70 | 2020 |
Speech enhancement using end-to-end speech recognition objectives AS Subramanian, X Wang, MK Baskar, S Watanabe, T Taniguchi, D Tran, ... 2019 IEEE Workshop on Applications of Signal Processing to Audio and …, 2019 | 63 | 2019 |
Personalized speech enhancement: New models and comprehensive evaluation SE Eskimez, T Yoshioka, H Wang, X Wang, Z Chen, X Huang ICASSP 2022-2022 IEEE International Conference on Acoustics, Speech and …, 2022 | 57 | 2022 |
The Hitachi/JHU CHiME-5 system: Advances in speech recognition for everyday home environments using multiple microphone arrays N Kanda, R Ikeshita, S Horiguchi, Y Fujita, K Nagamatsu, X Wang, ... Proc. CHiME-5, 6-10, 2018 | 54 | 2018 |
Investigation of end-to-end speaker-attributed ASR for continuous multi-talker recordings N Kanda, X Chang, Y Gaur, X Wang, Z Meng, Z Chen, T Yoshioka 2021 IEEE Spoken Language Technology Workshop (SLT), 809-816, 2021 | 42 | 2021 |
Streaming multi-talker ASR with token-level serialized output training N Kanda, J Wu, Y Wu, X Xiao, Z Meng, X Wang, Y Gaur, Z Chen, J Li, ... Interspeech 2022, 3774-3778, 2022 | 39 | 2022 |
End-to-end speaker-attributed ASR with transformer N Kanda, G Ye, Y Gaur, X Wang, Z Meng, Z Chen, T Yoshioka Interspeech 2021, 4413-4417, 2021 | 36 | 2021 |
Large-scale pre-training of end-to-end multi-talker ASR for meeting transcription with single distant microphone N Kanda, G Ye, Y Wu, Y Gaur, X Wang, Z Meng, Z Chen, T Yoshioka Interspeech 2021, 3430-3434, 2021 | 34 | 2021 |
SpeechX: Neural codec language model as a versatile speech transformer X Wang, M Thakker, Z Chen, N Kanda, SE Eskimez, S Chen, M Tang, ... arXiv preprint arXiv:2308.06873, 2023 | 33 | 2023 |
VarArray: Array-geometry-agnostic continuous speech separation T Yoshioka, X Wang, D Wang, M Tang, Z Zhu, Z Chen, N Kanda ICASSP 2022-2022 IEEE International Conference on Acoustics, Speech and …, 2022 | 28 | 2022 |
Improving noise robustness of contrastive speech representation learning with speech reconstruction H Wang, Y Qian, X Wang, Y Wang, C Wang, S Liu, T Yoshioka, J Li, ... ICASSP 2022-2022 IEEE International Conference on Acoustics, Speech and …, 2022 | 28 | 2022 |
Stream attention-based multi-array end-to-end speech recognition X Wang, R Li, SH Mallidi, T Hori, S Watanabe, H Hermansky ICASSP 2019-2019 IEEE International Conference on Acoustics, Speech and …, 2019 | 27 | 2019 |
An investigation of end-to-end multichannel speech recognition for reverberant and mismatch conditions AS Subramanian, X Wang, S Watanabe, T Taniguchi, D Tran, Y Fujita arXiv preprint arXiv:1904.09049, 2019 | 27 | 2019 |
Human listening and live captioning: Multi-task training for speech enhancement SE Eskimez, X Wang, M Tang, H Yang, Z Zhu, Z Chen, H Wang, ... Interspeech 2021, 2686-2690, 2021 | 25 | 2021 |
Oracle performance investigation of the ideal masks Z Wang, X Wang, X Li, Q Fu, Y Yan 2016 IEEE International Workshop on Acoustic Signal Enhancement (IWAENC), 1-5, 2016 | 24 | 2016 |
Transcribe-to-diarize: Neural speaker diarization for unlimited number of speakers using end-to-end speaker-attributed ASR N Kanda, X Xiao, Y Gaur, X Wang, Z Meng, Z Chen, T Yoshioka ICASSP 2022-2022 IEEE International Conference on Acoustics, Speech and …, 2022 | 23 | 2022 |
Multi-stream end-to-end speech recognition R Li, X Wang, SH Mallidi, S Watanabe, T Hori, H Hermansky IEEE/ACM Transactions on Audio, Speech, and Language Processing 28, 646-655, 2019 | 23 | 2019 |
Continuous speech separation with ad hoc microphone arrays D Wang, T Yoshioka, Z Chen, X Wang, T Zhou, Z Meng 2021 29th European Signal Processing Conference (EUSIPCO), 1100-1104, 2021 | 21 | 2021 |