Deep learning for environmentally robust speech recognition: An overview of recent developments
Eliminating the negative effect of non-stationary environmental noise is a long-standing
research topic for automatic speech recognition but still remains an important challenge …
research topic for automatic speech recognition but still remains an important challenge …
Adaptation algorithms for neural network-based speech recognition: An overview
We present a structured overview of adaptation algorithms for neural network-based speech
recognition, considering both hybrid hidden Markov model/neural network systems and end …
recognition, considering both hybrid hidden Markov model/neural network systems and end …
Conditional diffusion probabilistic model for speech enhancement
Speech enhancement is a critical component of many user-oriented audio applications, yet
current systems still suffer from distorted and unnatural outputs. While generative models …
current systems still suffer from distorted and unnatural outputs. While generative models …
CHiME-6 challenge: Tackling multispeaker speech recognition for unsegmented recordings
Following the success of the 1st, 2nd, 3rd, 4th and 5th CHiME challenges we organize the
6th CHiME Speech Separation and Recognition Challenge (CHiME-6). The new challenge …
6th CHiME Speech Separation and Recognition Challenge (CHiME-6). The new challenge …
Hybrid CTC/attention architecture for end-to-end speech recognition
Conventional automatic speech recognition (ASR) based on a hidden Markov model
(HMM)/deep neural network (DNN) is a very complicated system consisting of various …
(HMM)/deep neural network (DNN) is a very complicated system consisting of various …
Joint CTC-attention based end-to-end speech recognition using multi-task learning
Recently, there has been an increasing interest in end-to-end speech recognition that
directly transcribes speech to text without any predefined alignments. One approach is the …
directly transcribes speech to text without any predefined alignments. One approach is the …
Robust self-supervised audio-visual speech recognition
Audio-based automatic speech recognition (ASR) degrades significantly in noisy
environments and is particularly vulnerable to interfering speech, as the model cannot …
environments and is particularly vulnerable to interfering speech, as the model cannot …
The fifth'CHiME'speech separation and recognition challenge: dataset, task and baselines
The CHiME challenge series aims to advance robust automatic speech recognition (ASR)
technology by promoting research at the interface of speech and language processing …
technology by promoting research at the interface of speech and language processing …
Complex spectral mapping for single-and multi-channel speech enhancement and robust ASR
This study proposes a complex spectral mapping approach for single-and multi-channel
speech enhancement, where deep neural networks (DNNs) are used to predict the real and …
speech enhancement, where deep neural networks (DNNs) are used to predict the real and …
ESPnet: End-to-end speech processing toolkit
This paper introduces a new open source platform for end-to-end speech processing named
ESPnet. ESPnet mainly focuses on end-to-end automatic speech recognition (ASR), and …
ESPnet. ESPnet mainly focuses on end-to-end automatic speech recognition (ASR), and …