Joint training of complex ratio mask based beamformer and acoustic model for noise robust ASR

Z Zhang, Y Xu, M Yu, SX Zhang… - ICASSP 2021-2021 …, 2021 - ieeexplore.ieee.org

Speech separation algorithms are often used to separate the target speech from other
interfering sources. However, purely neural network based speech separation systems often …

Cited by 141 · Related articles · All 7 versions

Noise robust automatic speech recognition: review and analysis

M Dua, Akanksha, S Dua - International Journal of Speech Technology, 2023 - Springer

Automatic Speech Recognition (ASR) system is an emerging technology used in
various fields such as robotics, traffic controls, and healthcare, etc. The leading cause of …

Cited by 8 · Related articles · All 2 versions

[PDF] academia.edu

Speech enhancement using end-to-end speech recognition objectives

AS Subramanian, X Wang, MK Baskar… - … IEEE Workshop on …, 2019 - ieeexplore.ieee.org

Speech enhancement systems, which denoise and dereverberate distorted signals, are
usually optimized based on signal reconstruction objectives including the maximum …

Cited by 66 · Related articles · All 8 versions

[PDF] nsf.gov

Robust speaker recognition based on single-channel and multi-channel speech enhancement

H Taherian, ZQ Wang, J Chang… - IEEE/ACM Transactions …, 2020 - ieeexplore.ieee.org

Deep neural network (DNN) embeddings for speaker recognition have recently attracted
much attention. Compared to i-vectors, they are more robust to noise and room …

Cited by 65 · Related articles · All 6 versions

[PDF] arxiv.org

Audio-visual end-to-end multi-channel speech separation, dereverberation and recognition

G Li, J Deng, M Geng, Z Jin, T Wang… - … on Audio, Speech …, 2023 - ieeexplore.ieee.org

Accurate recognition of cocktail party speech containing overlapping speakers, noise and
reverberation remains a highly challenging task to date. Motivated by the invariance of …

Cited by 12 · Related articles · All 6 versions

[PDF] uni-paderborn.de

End-to-end dereverberation, beamforming, and speech recognition in a cocktail party

W Zhang, X Chang, C Boeddeker… - … on Audio, Speech …, 2022 - ieeexplore.ieee.org

Far-field multi-speaker automatic speech recognition (ASR) has drawn increasing attention
in recent years. Most existing methods feature a signal processing frontend and an ASR …

Cited by 18 · Related articles · All 5 versions

[PDF] arxiv.org

On loss functions and recurrency training for GAN-based speech enhancement systems

Z Zhang, C Deng, Y Shen, DS Williamson… - arXiv preprint arXiv …, 2020 - arxiv.org

Recent work has shown that it is feasible to use generative adversarial networks (GANs) for
speech enhancement, however, these approaches have not been compared to state-of-the …

Cited by 46 · Related articles · All 13 versions

[PDF] arxiv.org

Multi-channel multi-frame ADL-MVDR for target speech separation

Z Zhang, Y Xu, M Yu, SX Zhang, L Chen… - … on Audio, Speech …, 2021 - ieeexplore.ieee.org

Many purely neural network based speech separation approaches have been proposed to
improve objective assessment scores, but they often introduce nonlinear distortions that are …

Cited by 34 · Related articles · All 5 versions

[PDF] arxiv.org

Self-attention generative adversarial network for speech enhancement

H Phan, H Le Nguyen, OY Chén, P Koch… - ICASSP 2021-2021 …, 2021 - ieeexplore.ieee.org

Existing generative adversarial networks (GANs) for speech enhancement solely rely on the
convolution operation, which may obscure temporal dependencies across the sequence …

Cited by 39 · Related articles · All 11 versions

[PDF] arxiv.org

Neural spatio-temporal beamformer for target speech separation

Y Xu, M Yu, SX Zhang, L Chen, C Weng, J Liu… - arXiv preprint arXiv …, 2020 - arxiv.org

Purely neural network (NN) based speech separation and enhancement methods, although
can achieve good objective scores, inevitably cause nonlinear speech distortions that are …

Cited by 43 · Related articles · All 8 versions