Contextual RNN-T for open domain ASR

J Li - APSIPA Transactions on Signal and Information …, 2022 - nowpublishers.com

Recently, the speech community is seeing a significant trend of moving from deep neural
network based hybrid modeling to end-to-end (E2E) modeling for automatic speech …

Cited by 325 - Related articles - All 7 versions

[PDF] ieee.org

Adaptation algorithms for neural network-based speech recognition: An overview

P Bell, J Fainberg, O Klejch, J Li… - IEEE Open Journal …, 2020 - ieeexplore.ieee.org

We present a structured overview of adaptation algorithms for neural network-based speech
recognition, considering both hybrid hidden Markov model/neural network systems and end …

Cited by 87 - Related articles - All 7 versions

[PDF] arxiv.org

WeNet 2.0: More productive end-to-end speech recognition toolkit

B Zhang, D Wu, Z Peng, X Song, Z Yao, H Lv… - arXiv preprint arXiv …, 2022 - arxiv.org

Recently, we made available WeNet, a production-oriented end-to-end speech recognition
toolkit, which introduces a unified two-pass (U2) framework and a built-in runtime to address …

Cited by 67 - Related articles - All 2 versions

[PDF] arxiv.org

Contextual adapters for personalized speech recognition in neural transducers

KM Sathyendra, T Muniyappa, FJ Chang… - ICASSP 2022-2022 …, 2022 - ieeexplore.ieee.org

Personal rare word recognition in end-to-end Automatic Speech Recognition (E2E ASR)
models is a challenge due to the lack of training data. A standard way to address this issue …

Cited by 61 - Related articles - All 4 versions

[PDF] arxiv.org

Context-aware transformer transducer for speech recognition

FJ Chang, J Liu, M Radfar, A Mouchtaris… - 2021 IEEE Automatic …, 2021 - ieeexplore.ieee.org

End-to-end (E2E) automatic speech recognition (ASR) systems often have difficulty
recognizing uncommon words, that appear infrequently in the training data. One promising …

Cited by 71 - Related articles - All 6 versions

[PDF] arxiv.org

Contextualized streaming end-to-end speech recognition with trie-based deep biasing and shallow fusion

D Le, M Jain, G Keren, S Kim, Y Shi… - arXiv preprint arXiv …, 2021 - arxiv.org

How to leverage dynamic contextual information in end-to-end speech recognition has
remained an active research area. Previous solutions to this problem were either designed …

Cited by 73 - Related articles - All 5 versions

[PDF] arxiv.org

Deep shallow fusion for RNN-T personalization

D Le, G Keren, J Chan, J Mahadeokar… - 2021 IEEE Spoken …, 2021 - ieeexplore.ieee.org

End-to-end models in general, and Recurrent Neural Network Transducer (RNN-T) in
particular, have gained significant traction in the automatic speech recognition community in …

Cited by 73 - Related articles - All 3 versions

[PDF] arxiv.org

Improving end-to-end contextual speech recognition with fine-grained contextual knowledge selection

M Han, L Dong, Z Liang, M Cai, S Zhou… - ICASSP 2022-2022 …, 2022 - ieeexplore.ieee.org

Nowadays, most methods for end-to-end contextual speech recognition bias the recognition
process towards contextual knowledge. Since all-neural contextual biasing methods rely on …

Cited by 34 - Related articles - All 3 versions

[PDF] amazon.science

Personalization of CTC speech recognition models

S Dingliwal, M Sunkara, S Ronanki… - 2022 IEEE Spoken …, 2023 - ieeexplore.ieee.org

End-to-end speech recognition models trained using joint Connectionist Temporal
Classification (CTC)-Attention loss have gained popularity recently. In these models, a non …

Cited by 25 - Related articles - All 3 versions

NAM+: Towards scalable end-to-end contextual biasing for adaptive ASR

T Munkhdalai, Z Wu, G Pundak, KC Sim… - 2022 IEEE Spoken …, 2023 - ieeexplore.ieee.org

Attention-based biasing techniques for end-to-end ASR systems are able to achieve large
accuracy gains without requiring the inference algorithm adjustments and parameter tuning …

Cited by 16 - Related articles