Neural target speech extraction: An overview
K Zmolikova, M Delcroix, T Ochiai… - IEEE Signal …, 2023 - ieeexplore.ieee.org
Humans can listen to a target speaker even in challenging acoustic conditions that have
noise, reverberation, and interfering speakers. This phenomenon is known as the cocktail …
Recent progress in the CUHK dysarthric speech recognition system
Despite the rapid progress of automatic speech recognition (ASR) technologies in the past
few decades, recognition of disordered speech remains a highly challenging task to date …
Automatic speech recognition method based on deep learning approaches for Uzbek language
Communication has been an important aspect of human life, civilization, and globalization
for thousands of years. Biometric analysis, education, security, healthcare, and smart cities …
Audio-visual end-to-end multi-channel speech separation, dereverberation and recognition
Accurate recognition of cocktail party speech containing overlapping speakers, noise and
reverberation remains a highly challenging task to date. Motivated by the invariance of …
Bayesian neural network language modeling for speech recognition
State-of-the-art neural network language models (NNLMs) represented by long short term
memory recurrent neural networks (LSTM-RNNs) and Transformers are becoming highly …
Restoring Speaking Lips from Occlusion for Audio-Visual Speech Recognition
Prior studies on audio-visual speech recognition typically assume the visibility of speaking
lips, ignoring the fact that visual occlusion occurs in real-world videos, thus adversely …
End-to-end multi-modal speech recognition on an air and bone conducted speech corpus
Automatic speech recognition (ASR) has been significantly improved in the past years.
However, most robust ASR systems are based on air-conducted (AC) speech, and their …
Audio-visual multi-channel speech separation, dereverberation and recognition
Despite the rapid advance of automatic speech recognition (ASR) technologies, accurate
recognition of cocktail party speech characterised by the interference from overlapping …
Multi-channel multi-speaker ASR using 3D spatial feature
Automatic speech recognition (ASR) of multi-channel multi-speaker overlapped speech
remains one of the most challenging tasks to the speech community. In this paper, we look …
Mixed precision DNN quantization for overlapped speech separation and recognition
Recognition of overlapped speech has been a highly challenging task to date.
State-of-the-art multi-channel speech separation systems are becoming increasingly complex and …