Adaptation algorithms for neural network-based speech recognition: An overview
We present a structured overview of adaptation algorithms for neural network-based speech
recognition, considering both hybrid hidden Markov model/neural network systems and end …
Salvaging federated learning by local adaptation
Federated learning (FL) is a heavily promoted approach for training ML models on sensitive
data, e.g., text typed by users on their smartphones. FL is expressly designed for training on …
Large-scale multilingual speech recognition with a streaming end-to-end model
Multilingual end-to-end (E2E) models have shown great promise in expansion of automatic
speech recognition (ASR) coverage of the world's languages. They have shown …
Internal language model estimation for domain-adaptive end-to-end speech recognition
The external language models (LM) integration remains a challenging task for end-to-end
(E2E) automatic speech recognition (ASR) which has no clear division between acoustic …
Recent progresses in deep learning based acoustic models
In this paper, we summarize recent progresses made in deep learning based acoustic
models and the motivation and insights behind the surveyed techniques. We first discuss …
Multi-dialect speech recognition with a single sequence-to-sequence model
Sequence-to-sequence models provide a simple and elegant solution for building speech
recognition systems by folding separate components of a typical system, namely acoustic …
Speaker-invariant training via adversarial learning
We propose a novel adversarial multi-task learning scheme, aiming at actively curtailing the
inter-talker feature variability while maximizing its senone discriminability so as to enhance …
The accented English speech recognition challenge 2020: open datasets, tracks, baselines, results and methods
The variety of accents has posed a big challenge to speech recognition. The Accented
English Speech Recognition Challenge (AESRC2020) is designed for providing a common …
Learning hidden unit contributions for unsupervised acoustic model adaptation
This work presents a broad study on the adaptation of neural network acoustic models by
means of learning hidden unit contributions (LHUC), a method that linearly re-combines …
Internal language model training for domain-adaptive end-to-end speech recognition
The efficacy of external language model (LM) integration with existing end-to-end (E2E)
automatic speech recognition (ASR) systems can be improved significantly using the …