ESPnet-SLU: Advancing Spoken Language Understanding through ESPnet S Arora, S Dalmia, P Denisov, X Chang, Y Ueda, Y Peng, Y Zhang, ... ICASSP 2022, 2022 | 70 | 2022 |
Prompting the hidden talent of web-scale speech models for zero-shot task generalization P Peng, B Yan, S Watanabe, D Harwath arXiv preprint arXiv:2305.11095, 2023 | 31 | 2023 |
Searchable hidden intermediates for end-to-end models of decomposable sequence tasks S Dalmia, B Yan, V Raunak, F Metze, S Watanabe NAACL 2021, 2021 | 30 | 2021 |
Improving massively multilingual ASR with auxiliary CTC objectives W Chen, B Yan, J Shi, Y Peng, S Maiti, S Watanabe ICASSP 2023-2023 IEEE International Conference on Acoustics, Speech and …, 2023 | 29 | 2023 |
Exploration of efficient end-to-end ASR using discretized input from self-supervised learning X Chang, B Yan, Y Fujita, T Maekaku, S Watanabe arXiv preprint arXiv:2305.18108, 2023 | 28 | 2023 |
CTC Alignments Improve Autoregressive Translation B Yan, S Dalmia, Y Higuchi, G Neubig, F Metze, AW Black, S Watanabe EACL 2023, 2022 | 28 | 2022 |
Reproducing Whisper-style training using an open-source toolkit and publicly available data Y Peng, J Tian, B Yan, D Berrebbi, X Chang, X Li, J Shi, S Arora, W Chen, ... 2023 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU), 1-8, 2023 | 22 | 2023 |
BERT meets CTC: New formulation of end-to-end speech recognition with pre-trained masked language model Y Higuchi, B Yan, S Arora, T Ogawa, T Kobayashi, S Watanabe EMNLP 2022, 2022 | 22 | 2022 |
ESPnet-SE++: Speech enhancement for robust speech recognition, translation, and understanding YJ Lu, X Chang, C Li, W Zhang, S Cornell, Z Ni, Y Masuyama, B Yan, ... arXiv preprint arXiv:2207.09514, 2022 | 22 | 2022 |
ESPnet-ST IWSLT 2021 Offline Speech Translation System H Inaguma, B Yan, S Dalmia, P Gu, J Shi, K Duh, S Watanabe IWSLT 2021, 2021 | 20 | 2021 |
Combining spectral and self-supervised features for low resource speech recognition and translation D Berrebbi, J Shi, B Yan, O López-Francisco, JD Amith, S Watanabe arXiv preprint arXiv:2204.02470, 2022 | 19 | 2022 |
Joint Modeling of Code-Switched and Monolingual ASR via Conditional Factorization B Yan, C Zhang, M Yu, SX Zhang, S Dalmia, D Berrebbi, C Weng, ... ICASSP 2022, 2022 | 17 | 2022 |
Exploring speech recognition, translation, and understanding with discrete speech units: A comparative study X Chang, B Yan, K Choi, JW Jung, Y Lu, S Maiti, R Sharma, J Shi, J Tian, ... ICASSP 2024-2024 IEEE International Conference on Acoustics, Speech and …, 2024 | 15 | 2024 |
4D ASR: Joint modeling of CTC, attention, transducer, and mask-predict decoders Y Sudo, M Shakeel, B Yan, J Shi, S Watanabe arXiv preprint arXiv:2212.10818, 2022 | 15 | 2022 |
Two-pass low latency end-to-end spoken language understanding S Arora, S Dalmia, X Chang, B Yan, A Black, S Watanabe arXiv preprint arXiv:2207.06670, 2022 | 14 | 2022 |
OWSM v3.1: Better and faster open Whisper-style speech models based on E-Branchformer Y Peng, J Tian, W Chen, S Arora, B Yan, Y Sudo, M Shakeel, K Choi, ... arXiv preprint arXiv:2401.16658, 2024 | 13 | 2024 |
Token-level sequence labeling for spoken language understanding using compositional end-to-end models S Arora, S Dalmia, B Yan, F Metze, AW Black, S Watanabe arXiv preprint arXiv:2210.15734, 2022 | 13 | 2022 |
Fast-MD: Fast Multi-Decoder End-to-End Speech Translation with Non-Autoregressive Hidden Intermediates H Inaguma, S Dalmia, B Yan, S Watanabe ASRU 2021, 2021 | 12 | 2021 |
Differentiable Allophone Graphs for Language-Universal Speech Recognition B Yan, S Dalmia, DR Mortensen, F Metze, S Watanabe INTERSPEECH 2021, 2021 | 12 | 2021 |
Highland Puebla Nahuatl speech translation corpus for endangered language documentation J Shi, JD Amith, X Chang, S Dalmia, B Yan, S Watanabe Proceedings of the First Workshop on Natural Language Processing for …, 2021 | 12 | 2021 |