State-of-the-art speech recognition with sequence-to-sequence models CC Chiu, TN Sainath, Y Wu, R Prabhavalkar, P Nguyen, Z Chen, ... 2018 IEEE International Conference on Acoustics, Speech and Signal …, 2018 | 1421 | 2018 |
Deep Learning for Audio Signal Processing H Purwins, B Li, T Virtanen, J Schlüter, SY Chang, T Sainath IEEE Journal of Selected Topics in Signal Processing 13 (2), 206-219, 2019 | 795 | 2019 |
Streaming End-to-end Speech Recognition for Mobile Devices Y He, TN Sainath, R Prabhavalkar, I McGraw, R Alvarez, D Zhao, ... ICASSP 2019-2019 IEEE International Conference on Acoustics, Speech and …, 2019 | 703 | 2019 |
A Comparison of Sequence-to-Sequence Models for Speech Recognition R Prabhavalkar, K Rao, TN Sainath, B Li, L Johnson, N Jaitly Proc. Interspeech 2017, 939-943, 2017 | 379 | 2017 |
Multilingual speech recognition with a single end-to-end model S Toshniwal, TN Sainath, RJ Weiss, B Li, P Moreno, E Weinstein, K Rao 2018 IEEE International Conference on Acoustics, Speech and Signal …, 2018 | 278 | 2018 |
Exploring speech enhancement with generative adversarial networks for robust speech recognition C Donahue, B Li, R Prabhavalkar 2018 IEEE International Conference on Acoustics, Speech and Signal …, 2018 | 266 | 2018 |
Multichannel Signal Processing With Deep Neural Networks for Automatic Speech Recognition TN Sainath, RJ Weiss, KW Wilson, B Li, A Narayanan, E Variani, ... IEEE/ACM Transactions on Audio, Speech, and Language Processing 25 (5), 965-979, 2017 | 265 | 2017 |
Improved Noisy Student Training for Automatic Speech Recognition DS Park, Y Zhang, Y Jia, W Han, CC Chiu, B Li, Y Wu, QV Le arXiv preprint arXiv:2005.09629, 2020 | 246 | 2020 |
A streaming on-device end-to-end model surpassing server-side conventional model quality and latency TN Sainath, Y He, B Li, A Narayanan, R Pang, A Bruguier, S Chang, W Li, ... ICASSP 2020-2020 IEEE International Conference on Acoustics, Speech and …, 2020 | 220 | 2020 |
Comparison of discriminative input and output transformations for speaker adaptation in the hybrid NN/HMM systems B Li, KC Sim INTERSPEECH, 526-529, 2010 | 207 | 2010 |
Lingvo: a modular and scalable framework for sequence-to-sequence modeling J Shen, P Nguyen, Y Wu, Z Chen, MX Chen, Y Jia, A Kannan, T Sainath, ... arXiv preprint arXiv:1902.08295, 2019 | 202 | 2019 |
Acoustic Modeling for Google Home B Li, T Sainath, A Narayanan, J Caroselli, M Bacchiani, A Misra, I Shafran, ... INTERSPEECH-2017, 2017 | 202 | 2017 |
Google USM: Scaling Automatic Speech Recognition Beyond 100 Languages Y Zhang, W Han, J Qin, Y Wang, A Bapna, Z Chen, N Chen, B Li, ... arXiv preprint arXiv:2303.01037, 2023 | 164 | 2023 |
BigSSL: Exploring the frontier of large-scale semi-supervised learning for automatic speech recognition Y Zhang, DS Park, W Han, J Qin, A Gulati, J Shor, A Jansen, Y Xu, ... IEEE Journal of Selected Topics in Signal Processing 16 (6), 1519-1532, 2022 | 159 | 2022 |
Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context M Reid, N Savinov, D Teplyashin, D Lepikhin, T Lillicrap, J Alayrac, ... arXiv preprint arXiv:2403.05530, 2024 | 154 | 2024 |
Bytes are All You Need: End-to-End Multilingual Speech Recognition and Synthesis with Bytes B Li, Y Zhang, T Sainath, Y Wu, W Chan ICASSP 2019-2019 IEEE International Conference on Acoustics, Speech and …, 2019 | 151 | 2019 |
Shallow-fusion end-to-end contextual biasing D Zhao, TN Sainath, D Rybach, P Rondon, D Bhatia, B Li, R Pang Submitted to Interspeech 2019, 2019 | 150 | 2019 |
SpecAugment on large scale datasets DS Park, Y Zhang, CC Chiu, Y Chen, B Li, W Chan, QV Le, Y Wu ICASSP 2020-2020 IEEE International Conference on Acoustics, Speech and …, 2020 | 149 | 2020 |
Neural Network Adaptive Beamforming for Robust Multichannel Speech Recognition B Li, TN Sainath, RJ Weiss, KW Wilson, M Bacchiani INTERSPEECH, 1976-1980, 2016 | 145 | 2016 |
Multi-dialect speech recognition with a single sequence-to-sequence model B Li, TN Sainath, KC Sim, M Bacchiani, E Weinstein, P Nguyen, Z Chen, ... 2018 IEEE International Conference on Acoustics, Speech and Signal …, 2018 | 136 | 2018 |