Recent advances in end-to-end automatic speech recognition
J Li - APSIPA Transactions on Signal and Information …, 2022 - nowpublishers.com
Recently, the speech community is seeing a significant trend of moving from deep neural
network based hybrid modeling to end-to-end (E2E) modeling for automatic speech …
Conformer: Convolution-augmented transformer for speech recognition
Recently Transformer and Convolution neural network (CNN) based models have shown
promising results in Automatic Speech Recognition (ASR), outperforming Recurrent neural …
Self-supervised learning with random-projection quantizer for speech recognition
We present a simple and effective self-supervised learning approach for speech recognition.
The approach learns a model to predict the masked speech signals, in the form of discrete …
End-to-end speech recognition: A survey
In the last decade of automatic speech recognition (ASR) research, the introduction of deep
learning has brought considerable reductions in word error rate of more than 50% relative …
ContextNet: Improving convolutional neural networks for automatic speech recognition with global context
Convolutional neural networks (CNN) have shown promising results for end-to-end speech
recognition, albeit still behind other state-of-the-art methods in performance. In this paper …
WeNet 2.0: More productive end-to-end speech recognition toolkit
Recently, we made available WeNet, a production-oriented end-to-end speech recognition
toolkit, which introduces a unified two-pass (U2) framework and a built-in runtime to address …
A better and faster end-to-end model for streaming ASR
End-to-end (E2E) models have shown to outperform state-of-the-art conventional models for
streaming speech recognition [1] across many dimensions, including quality (as measured …
On the comparison of popular end-to-end models for large scale speech recognition
Recently, there has been a strong push to transition from hybrid models to end-to-end (E2E)
models for automatic speech recognition. Currently, there are three promising E2E methods …
Internal language model estimation for domain-adaptive end-to-end speech recognition
The external language models (LM) integration remains a challenging task for end-to-end
(E2E) automatic speech recognition (ASR) which has no clear division between acoustic …
Developing RNN-T models surpassing high-performance hybrid models with customization capability
Because of its streaming nature, recurrent neural network transducer (RNN-T) is a very
promising end-to-end (E2E) model that may replace the popular hybrid model for automatic …