Gemini: a family of highly capable multimodal models | G Team, R Anil, S Borgeaud, Y Wu, JB Alayrac, J Yu, R Soricut, ... | arXiv preprint arXiv:2312.11805, 2023 | 1191 | 2023 |
Deep learning for audio signal processing | H Purwins, B Li, T Virtanen, J Schlüter, SY Chang, T Sainath | IEEE Journal of Selected Topics in Signal Processing 13 (2), 206-219, 2019 | 819 | 2019 |
Streaming end-to-end speech recognition for mobile devices | Y He, TN Sainath, R Prabhavalkar, I McGraw, R Alvarez, D Zhao, ... | ICASSP 2019-2019 IEEE International Conference on Acoustics, Speech and …, 2019 | 722 | 2019 |
Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context | M Reid, N Savinov, D Teplyashin, D Lepikhin, T Lillicrap, J Alayrac, ... | arXiv preprint arXiv:2403.05530, 2024 | 235 | 2024 |
A streaming on-device end-to-end model surpassing server-side conventional model quality and latency | TN Sainath, Y He, B Li, A Narayanan, R Pang, A Bruguier, S Chang, W Li, ... | ICASSP 2020-2020 IEEE International Conference on Acoustics, Speech and …, 2020 | 223 | 2020 |
Towards fast and accurate streaming end-to-end ASR | B Li, S Chang, TN Sainath, R Pang, Y He, T Strohman, Y Wu | ICASSP 2020-2020 IEEE International Conference on Acoustics, Speech and …, 2020 | 131 | 2020 |
A better and faster end-to-end model for streaming ASR | B Li, A Gulati, J Yu, TN Sainath, CC Chiu, A Narayanan, SY Chang, ... | ICASSP 2021-2021 IEEE International Conference on Acoustics, Speech and …, 2021 | 123 | 2021 |
Robust CNN-based speech recognition with Gabor filter kernels | SY Chang, N Morgan | Interspeech, 905-909, 2014 | 100 | 2014 |
FastEmit: Low-latency streaming ASR with sequence-level emission regularization | J Yu, CC Chiu, B Li, S Chang, TN Sainath, Y He, A Narayanan, W Han, ... | ICASSP 2021-2021 IEEE International Conference on Acoustics, Speech and …, 2021 | 93 | 2021 |
Personal VAD: Speaker-conditioned voice activity detection | S Ding, Q Wang, S Chang, L Wan, IL Moreno | arXiv preprint arXiv:1908.04284, 2019 | 85 | 2019 |
Temporal modeling using dilated convolution and gating for voice-activity-detection | SY Chang, B Li, G Simko, TN Sainath, A Tripathi, A van den Oord, ... | 2018 IEEE International Conference on Acoustics, Speech and Signal …, 2018 | 82 | 2018 |
Improved End-of-Query Detection for Streaming Speech Recognition | M Shannon, G Simko, SY Chang, C Parada | Interspeech, 1909-1913, 2017 | 50 | 2017 |
An Efficient Streaming Non-Recurrent On-Device End-to-End Model with Improvements to Rare-Word Modeling | TN Sainath, Y He, A Narayanan, R Botros, R Pang, D Rybach, C Allauzen, ... | Interspeech 8, 1777-1781, 2021 | 44 | 2021 |
Joint endpointing and decoding with end-to-end models | SY Chang, R Prabhavalkar, Y He, TN Sainath, G Simko | ICASSP 2019-2019 IEEE International Conference on Acoustics, Speech and …, 2019 | 43 | 2019 |
Endpoint Detection Using Grid Long Short-Term Memory Networks for Streaming Speech Recognition | SY Chang, B Li, TN Sainath, G Simko, C Parada | Interspeech, 3812-3816, 2017 | 36 | 2017 |
Improving the latency and quality of cascaded encoders | TN Sainath, Y He, A Narayanan, R Botros, W Wang, D Qiu, CC Chiu, ... | ICASSP 2022-2022 IEEE International Conference on Acoustics, Speech and …, 2022 | 26 | 2022 |
Streaming end-to-end multilingual speech recognition with joint language identification | C Zhang, B Li, T Sainath, T Strohman, S Mavandadi, S Chang, P Haghani | arXiv preprint arXiv:2209.06058, 2022 | 24 | 2022 |
Spectro-temporal features for noise-robust speech recognition using power-law nonlinearity and power-bias subtraction | SY Chang, BT Meyer, N Morgan | 2013 IEEE International Conference on Acoustics, Speech and Signal …, 2013 | 22 | 2013 |
The blame game in meeting room ASR: An analysis of feature versus model errors in noisy and mismatched conditions | SHK Parthasarathi, SY Chang, J Cohen, N Morgan, S Wegmann | 2013 IEEE International Conference on Acoustics, Speech and Signal …, 2013 | 20 | 2013 |
Turn-taking prediction for natural conversational speech | S Chang, B Li, TN Sainath, C Zhang, T Strohman, Q Liang, Y He | arXiv preprint arXiv:2208.13321, 2022 | 18 | 2022 |