ADL-MVDR: All deep learning MVDR beamformer for target speech separation

Z Zhang, Y Xu, M Yu, SX Zhang… - ICASSP 2021-2021 …, 2021 - ieeexplore.ieee.org
Speech separation algorithms are often used to separate the target speech from other
interfering sources. However, purely neural network based speech separation systems often …

Noise robust automatic speech recognition: review and analysis

M Dua, Akanksha, S Dua - International Journal of Speech Technology, 2023 - Springer
Automatic Speech Recognition (ASR) system is an emerging technology used in
various fields such as robotics, traffic controls, and healthcare, etc. The leading cause of …

Speech enhancement using end-to-end speech recognition objectives

AS Subramanian, X Wang, MK Baskar… - … IEEE Workshop on …, 2019 - ieeexplore.ieee.org
Speech enhancement systems, which denoise and dereverberate distorted signals, are
usually optimized based on signal reconstruction objectives including the maximum …

Robust speaker recognition based on single-channel and multi-channel speech enhancement

H Taherian, ZQ Wang, J Chang… - IEEE/ACM Transactions …, 2020 - ieeexplore.ieee.org
Deep neural network (DNN) embeddings for speaker recognition have recently attracted
much attention. Compared to i-vectors, they are more robust to noise and room …

Audio-visual end-to-end multi-channel speech separation, dereverberation and recognition

G Li, J Deng, M Geng, Z Jin, T Wang… - … on Audio, Speech …, 2023 - ieeexplore.ieee.org
Accurate recognition of cocktail party speech containing overlapping speakers, noise and
reverberation remains a highly challenging task to date. Motivated by the invariance of …

End-to-end dereverberation, beamforming, and speech recognition in a cocktail party

W Zhang, X Chang, C Boeddeker… - … on Audio, Speech …, 2022 - ieeexplore.ieee.org
Far-field multi-speaker automatic speech recognition (ASR) has drawn increasing attention
in recent years. Most existing methods feature a signal processing frontend and an ASR …

On loss functions and recurrency training for GAN-based speech enhancement systems

Z Zhang, C Deng, Y Shen, DS Williamson… - arXiv preprint arXiv …, 2020 - arxiv.org
Recent work has shown that it is feasible to use generative adversarial networks (GANs) for
speech enhancement, however, these approaches have not been compared to state-of-the …

Multi-channel multi-frame ADL-MVDR for target speech separation

Z Zhang, Y Xu, M Yu, SX Zhang, L Chen… - … on Audio, Speech …, 2021 - ieeexplore.ieee.org
Many purely neural network based speech separation approaches have been proposed to
improve objective assessment scores, but they often introduce nonlinear distortions that are …

Self-attention generative adversarial network for speech enhancement

H Phan, H Le Nguyen, OY Chén, P Koch… - ICASSP 2021-2021 …, 2021 - ieeexplore.ieee.org
Existing generative adversarial networks (GANs) for speech enhancement solely rely on the
convolution operation, which may obscure temporal dependencies across the sequence …

Neural spatio-temporal beamformer for target speech separation

Y Xu, M Yu, SX Zhang, L Chen, C Weng, J Liu… - arXiv preprint arXiv …, 2020 - arxiv.org
Purely neural network (NN) based speech separation and enhancement methods, although
can achieve good objective scores, inevitably cause nonlinear speech distortions that are …