End-to-end speech recognition: A survey
In the last decade of automatic speech recognition (ASR) research, the introduction of deep
learning has brought considerable reductions in word error rate of more than 50% relative …
learning has brought considerable reductions in word error rate of more than 50% relative …
Bigssl: Exploring the frontier of large-scale semi-supervised learning for automatic speech recognition
We summarize the results of a host of efforts using giant automatic speech recognition (ASR)
models pre-trained using large, diverse unlabeled datasets containing approximately a …
models pre-trained using large, diverse unlabeled datasets containing approximately a …
An accurate and rapidly calibrating speech neuroprosthesis
NS Card, M Wairagkar, C Iacobacci… - … England Journal of …, 2024 - Mass Medical Soc
Background Brain–computer interfaces can enable communication for people with paralysis
by transforming cortical activity associated with attempted speech into text on a computer …
by transforming cortical activity associated with attempted speech into text on a computer …
Diagonal state space augmented transformers for speech recognition
We improve on the popular conformer architecture by replacing the depthwise temporal
convolutions with diagonal state space (DSS) models. DSS is a recently introduced variant …
convolutions with diagonal state space (DSS) models. DSS is a recently introduced variant …
Speaker adaptation using spectro-temporal deep features for dysarthric and elderly speech recognition
Despite the rapid progress of automatic speech recognition (ASR) technologies targeting
normal speech in recent decades, accurate recognition of dysarthric and elderly speech …
normal speech in recent decades, accurate recognition of dysarthric and elderly speech …
VarArray: Array-geometry-agnostic continuous speech separation
Continuous speech separation using a microphone array was shown to be promising in
dealing with the speech overlap problem in natural conversation transcription. This paper …
dealing with the speech overlap problem in natural conversation transcription. This paper …
Bayesian neural network language modeling for speech recognition
State-of-the-art neural network language models (NNLMs) represented by long short term
memory recurrent neural networks (LSTM-RNNs) and Transformers are becoming highly …
memory recurrent neural networks (LSTM-RNNs) and Transformers are becoming highly …
Modular domain adaptation for conformer-based streaming asr
Speech data from different domains has distinct acoustic and linguistic characteristics. It is
common to train a single multidomain model such as a Conformer transducer for speech …
common to train a single multidomain model such as a Conformer transducer for speech …
Confidence score based speaker adaptation of conformer speech recognition systems
Speaker adaptation techniques provide a powerful solution to customise automatic speech
recognition (ASR) systems for individual users. Practical application of unsupervised model …
recognition (ASR) systems for individual users. Practical application of unsupervised model …
Efficient training of neural transducer for speech recognition
As one of the most popular sequence-to-sequence modeling approaches for speech
recognition, the RNN-Transducer has achieved evolving performance with more and more …
recognition, the RNN-Transducer has achieved evolving performance with more and more …