Adaptation algorithms for neural network-based speech recognition: An overview

P Bell, J Fainberg, O Klejch, J Li… - IEEE Open Journal …, 2020 - ieeexplore.ieee.org
We present a structured overview of adaptation algorithms for neural network-based speech
recognition, considering both hybrid hidden Markov model/neural network systems and end …

Salvaging federated learning by local adaptation

T Yu, E Bagdasaryan, V Shmatikov - arXiv preprint arXiv:2002.04758, 2020 - arxiv.org
Federated learning (FL) is a heavily promoted approach for training ML models on sensitive
data, e.g., text typed by users on their smartphones. FL is expressly designed for training on …

Large-scale multilingual speech recognition with a streaming end-to-end model

A Kannan, A Datta, TN Sainath, E Weinstein… - arXiv preprint arXiv …, 2019 - arxiv.org
Multilingual end-to-end (E2E) models have shown great promise in expansion of automatic
speech recognition (ASR) coverage of the world's languages. They have shown …

Internal language model estimation for domain-adaptive end-to-end speech recognition

Z Meng, S Parthasarathy, E Sun, Y Gaur… - 2021 IEEE Spoken …, 2021 - ieeexplore.ieee.org
The external language models (LM) integration remains a challenging task for end-to-end
(E2E) automatic speech recognition (ASR) which has no clear division between acoustic …

Recent progresses in deep learning based acoustic models

D Yu, J Li - IEEE/CAA Journal of Automatica Sinica, 2017 - ieeexplore.ieee.org
In this paper, we summarize recent progresses made in deep learning based acoustic
models and the motivation and insights behind the surveyed techniques. We first discuss …

Multi-dialect speech recognition with a single sequence-to-sequence model

B Li, TN Sainath, KC Sim, M Bacchiani… - … on acoustics, speech …, 2018 - ieeexplore.ieee.org
Sequence-to-sequence models provide a simple and elegant solution for building speech
recognition systems by folding separate components of a typical system, namely acoustic …

Speaker-invariant training via adversarial learning

Z Meng, J Li, Z Chen, Y Zhao, V Mazalov… - … , Speech and Signal …, 2018 - ieeexplore.ieee.org
We propose a novel adversarial multi-task learning scheme, aiming at actively curtailing the
inter-talker feature variability while maximizing its senone discriminability so as to enhance …

The accented English speech recognition challenge 2020: open datasets, tracks, baselines, results and methods

X Shi, F Yu, Y Lu, Y Liang, Q Feng… - ICASSP 2021-2021 …, 2021 - ieeexplore.ieee.org
The variety of accents has posed a big challenge to speech recognition. The Accented
English Speech Recognition Challenge (AESRC2020) is designed for providing a common …

Learning hidden unit contributions for unsupervised acoustic model adaptation

P Swietojanski, J Li, S Renals - IEEE/ACM Transactions on …, 2016 - ieeexplore.ieee.org
This work presents a broad study on the adaptation of neural network acoustic models by
means of learning hidden unit contributions (LHUC), a method that linearly re-combines …

Internal language model training for domain-adaptive end-to-end speech recognition

Z Meng, N Kanda, Y Gaur… - ICASSP 2021-2021 …, 2021 - ieeexplore.ieee.org
The efficacy of external language model (LM) integration with existing end-to-end (E2E)
automatic speech recognition (ASR) systems can be improved significantly using the …