Deep contextualized acoustic representations for semi-supervised speech recognition
We propose a novel approach to semi-supervised automatic speech recognition (ASR). We
first exploit a large amount of unlabeled audio data via representation learning, where we …
Magic dust for cross-lingual adaptation of monolingual wav2vec-2.0
We propose a simple and effective cross-lingual transfer learning method to adapt
monolingual wav2vec-2.0 models for Automatic Speech Recognition (ASR) in resource …
Semi-supervised speech recognition via graph-based temporal classification
Semi-supervised learning has demonstrated promising results in automatic speech
recognition (ASR) by self-training using a seed ASR model with pseudo-labels generated for …
Semi-supervised learning with data augmentation for end-to-end ASR
F Weninger, F Mana, R Gemello… - arXiv preprint arXiv …, 2020 - arxiv.org
In this paper, we apply Semi-Supervised Learning (SSL) along with Data Augmentation (DA)
for improving the accuracy of End-to-End ASR. We focus on the consistency regularization …
A semi-supervised complementary joint training approach for low-resource speech recognition
Both unpaired speech and text have shown to be beneficial for low-resource automatic
speech recognition (ASR), which, however, were either separately used for pre-training, self …
Contextual semi-supervised learning: An approach to leverage air-surveillance and untranscribed ATC data in ASR systems
Air traffic management and specifically air-traffic control (ATC) rely mostly on voice
communications between Air Traffic Controllers (ATCos) and pilots. In most cases, these …
ILASR: privacy-preserving incremental learning for automatic speech recognition at production scale
Incremental learning is one paradigm to enable model building and updating at scale with
streaming data. For end-to-end automatic speech recognition (ASR) tasks, the absence of …
Modeling uncertainty in predicting emotional attributes from spontaneous speech
A challenging task in affective computing is to build reliable speech emotion recognition
(SER) systems that can accurately predict emotional attributes from spontaneous speech. To …
Low Resource German ASR with Untranscribed Data Spoken by Non-native Children--INTERSPEECH 2021 Shared Task SPAPL System
This paper describes the SPAPL system for the INTERSPEECH 2021 Challenge: Shared
Task on Automatic Speech Recognition for Non-Native Children's Speech in German.~ 5 …
Semi-supervised end-to-end ASR via teacher-student learning with conditional posterior distribution
Encoder-decoder based methods have become popular for automatic speech recognition
(ASR), thanks to their simplified processing stages and low reliance on prior knowledge …