[PDF][PDF] Recent advances in end-to-end automatic speech recognition
J Li - APSIPA Transactions on Signal and Information …, 2022 - nowpublishers.com
Recently, the speech community is seeing a significant trend of moving from deep neural
network based hybrid modeling to end-to-end (E2E) modeling for automatic speech …
network based hybrid modeling to end-to-end (E2E) modeling for automatic speech …
End-to-end speech recognition: A survey
In the last decade of automatic speech recognition (ASR) research, the introduction of deep
learning has brought considerable reductions in word error rate of more than 50% relative …
learning has brought considerable reductions in word error rate of more than 50% relative …
Scaling end-to-end models for large-scale multilingual asr
Building ASR models across many languages is a challenging multi-task learning problem
due to large variations and heavily unbalanced data. Existing work has shown positive …
due to large variations and heavily unbalanced data. Existing work has shown positive …
Massively multilingual asr: A lifelong learning solution
The development of end-to-end models has largely sped up the research in massively
multilingual automatic speech recognition (MMASR). Previous research has demonstrated …
multilingual automatic speech recognition (MMASR). Previous research has demonstrated …
Residual adapters for parameter-efficient ASR adaptation to atypical and accented speech
Automatic Speech Recognition (ASR) systems are often optimized to work best for speakers
with canonical speech patterns. Unfortunately, these systems perform poorly when tested on …
with canonical speech patterns. Unfortunately, these systems perform poorly when tested on …
[HTML][HTML] Traditional machine learning models and bidirectional encoder representations from transformer (BERT)–based automatic classification of tweets about …
JA Benítez-Andrades, JM Alija-Pérez… - JMIR medical …, 2022 - medinform.jmir.org
Background Eating disorders affect an increasing number of people. Social networks
provide information that can help. Objective We aimed to find machine learning models …
provide information that can help. Objective We aimed to find machine learning models …
Diagonal state space augmented transformers for speech recognition
We improve on the popular conformer architecture by replacing the depthwise temporal
convolutions with diagonal state space (DSS) models. DSS is a recently introduced variant …
convolutions with diagonal state space (DSS) models. DSS is a recently introduced variant …
On the limit of english conversational speech recognition
In our previous work we demonstrated that a single headed attention encoder-decoder
model is able to reach state-of-the-art results in conversational speech recognition. In this …
model is able to reach state-of-the-art results in conversational speech recognition. In this …
Large-scale streaming end-to-end speech translation with neural transducers
Neural transducers have been widely used in automatic speech recognition (ASR). In this
paper, we introduce it to streaming end-to-end speech translation (ST), which aims to …
paper, we introduce it to streaming end-to-end speech translation (ST), which aims to …
Integrating text inputs for training and adapting rnn transducer asr models
Compared to hybrid automatic speech recognition (ASR) systems that use a modular
architecture in which each component can be in-dependently adapted to a new domain …
architecture in which each component can be in-dependently adapted to a new domain …