Recent advances in end-to-end automatic speech recognition
J Li - APSIPA Transactions on Signal and Information …, 2022 - nowpublishers.com
Recently, the speech community is seeing a significant trend of moving from deep neural
network based hybrid modeling to end-to-end (E2E) modeling for automatic speech …
Adaptation algorithms for neural network-based speech recognition: An overview
We present a structured overview of adaptation algorithms for neural network-based speech
recognition, considering both hybrid hidden Markov model/neural network systems and end …
WeNet 2.0: More productive end-to-end speech recognition toolkit
Recently, we made available WeNet, a production-oriented end-to-end speech recognition
toolkit, which introduces a unified two-pass (U2) framework and a built-in runtime to address …
Contextual adapters for personalized speech recognition in neural transducers
Personal rare word recognition in end-to-end Automatic Speech Recognition (E2E ASR)
models is a challenge due to the lack of training data. A standard way to address this issue …
Context-aware transformer transducer for speech recognition
End-to-end (E2E) automatic speech recognition (ASR) systems often have difficulty
recognizing uncommon words that appear infrequently in the training data. One promising …
Contextualized streaming end-to-end speech recognition with trie-based deep biasing and shallow fusion
How to leverage dynamic contextual information in end-to-end speech recognition has
remained an active research area. Previous solutions to this problem were either designed …
Deep shallow fusion for RNN-T personalization
End-to-end models in general, and Recurrent Neural Network Transducer (RNN-T) in
particular, have gained significant traction in the automatic speech recognition community in …
Improving end-to-end contextual speech recognition with fine-grained contextual knowledge selection
Nowadays, most methods for end-to-end contextual speech recognition bias the recognition
process towards contextual knowledge. Since all-neural contextual biasing methods rely on …
Personalization of CTC speech recognition models
End-to-end speech recognition models trained using joint Connectionist Temporal
Classification (CTC)-Attention loss have gained popularity recently. In these models, a non …
NAM+: Towards scalable end-to-end contextual biasing for adaptive ASR
Attention-based biasing techniques for end-to-end ASR systems are able to achieve large
accuracy gains without requiring the inference algorithm adjustments and parameter tuning …