Deep learning for environmentally robust speech recognition: An overview of recent developments

Z Zhang, J Geiger, J Pohjalainen, AED Mousa… - ACM Transactions on …, 2018 - dl.acm.org
Eliminating the negative effect of non-stationary environmental noise is a long-standing
research topic in automatic speech recognition, yet it remains an important challenge …

Complex spectral mapping for single- and multi-channel speech enhancement and robust ASR

ZQ Wang, P Wang, DL Wang - IEEE/ACM Transactions on …, 2020 - ieeexplore.ieee.org
This study proposes a complex spectral mapping approach for single- and multi-channel
speech enhancement, where deep neural networks (DNNs) are used to predict the real and …
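
As a rough illustration of the complex spectral mapping setup described above (a sketch under assumptions, not the authors' code): the network's input and regression target are the stacked real and imaginary STFT components, so enhancement reduces to real-valued regression that can be inverted back to a complex spectrogram. Function names and shapes here are illustrative.

```python
import numpy as np

def complex_mapping_features(Y):
    """Stack real and imaginary parts of an STFT as a real-valued map.

    Y: (F, T) complex STFT. Returns a (2F, T) real array. In complex
    spectral mapping, a DNN consumes this representation of the noisy
    mixture and regresses the same representation of the clean speech.
    """
    return np.concatenate([Y.real, Y.imag], axis=0)

def reconstruct_complex(X):
    """Invert the stacking: (2F, T) real -> (F, T) complex STFT."""
    F = X.shape[0] // 2
    return X[:F] + 1j * X[F:]
```

The round trip is lossless, which is the point of mapping real and imaginary parts directly rather than magnitude alone: phase is carried through the regression target.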

NARA-WPE: A Python package for weighted prediction error dereverberation in Numpy and Tensorflow for online and offline processing

L Drude, J Heymann, C Boeddeker… - … 13th ITG-Symposium, 2018 - ieeexplore.ieee.org
NARA-WPE is a Python software package providing implementations of the weighted
prediction error (WPE) dereverberation algorithm. WPE has been shown to be a highly …
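
WPE removes late reverberation by variance-normalized delayed linear prediction, applied independently per STFT frequency bin. A minimal single-channel numpy sketch of that iteration follows (this is not the NARA-WPE implementation; parameter names and defaults are assumptions):

```python
import numpy as np

def wpe_single_band(y, taps=10, delay=3, iterations=3, eps=1e-10):
    """Single-channel WPE for one STFT frequency bin.

    y: complex array of shape (T,), the reverberant observation.
    Returns the dereverberated signal of the same shape.
    """
    T = len(y)
    # Delayed, stacked tap matrix: row k holds y shifted by (delay + k).
    Y_tilde = np.zeros((taps, T), dtype=complex)
    for k in range(taps):
        shift = delay + k
        Y_tilde[k, shift:] = y[:T - shift]
    x = y.copy()
    for _ in range(iterations):
        # Time-varying power estimate of the current desired signal.
        lam = np.maximum(np.abs(x) ** 2, eps)
        # Variance-weighted correlation statistics.
        Yw = Y_tilde / lam                       # (taps, T)
        R = Yw @ Y_tilde.conj().T                # (taps, taps)
        r = Yw @ y.conj()                        # (taps,)
        g = np.linalg.solve(R + eps * np.eye(taps), r)
        # Subtract the predicted late reverberation.
        x = y - g.conj() @ Y_tilde
    return x
```

The `delay` parameter keeps the prediction from cancelling the direct path and early reflections; only taps at least `delay` frames in the past contribute to the estimate of the late reverberation.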

Beamnet: End-to-end training of a beamformer-supported multi-channel ASR system

J Heymann, L Drude, C Boeddeker… - … , Speech and Signal …, 2017 - ieeexplore.ieee.org
This paper presents an end-to-end training approach for a beamformer-supported multi-
channel ASR system. A neural network which estimates masks for a statistically optimum …
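
In mask-based beamformers of this family, the statistically optimum filter is computed from spatial covariance matrices weighted by the network's speech and noise masks. The numpy sketch below uses a Souden-style MVDR formulation as one common choice of optimum beamformer (a sketch under assumptions, not the paper's system; names and shapes are illustrative):

```python
import numpy as np

def mask_based_mvdr(Y, speech_mask, noise_mask, ref=0, eps=1e-8):
    """Mask-driven MVDR beamforming for one frequency bin.

    Y: (D, T) complex multi-channel STFT at one frequency.
    speech_mask, noise_mask: (T,) real masks in [0, 1], e.g. from a NN.
    Returns the (T,) beamformed output.
    """
    D, T = Y.shape
    # Mask-weighted spatial covariance matrices.
    phi_s = (Y * speech_mask) @ Y.conj().T / max(speech_mask.sum(), eps)
    phi_n = (Y * noise_mask) @ Y.conj().T / max(noise_mask.sum(), eps)
    phi_n += eps * np.eye(D)  # regularize for invertibility
    # Souden MVDR: w = Phi_n^{-1} Phi_s u_ref / trace(Phi_n^{-1} Phi_s)
    num = np.linalg.solve(phi_n, phi_s)
    w = num[:, ref] / max(np.trace(num).real, eps)
    return w.conj() @ Y
```

Because the beamformer weights are a differentiable function of the masks, gradients from an ASR loss can flow back through this computation into the mask-estimation network, which is what makes end-to-end training of the combined system possible.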

Unified architecture for multichannel end-to-end speech recognition with neural beamforming

T Ochiai, S Watanabe, T Hori… - IEEE Journal of …, 2017 - ieeexplore.ieee.org
This paper proposes a unified architecture for end-to-end automatic speech recognition
(ASR) to encompass microphone-array signal processing such as a state-of-the-art neural …


Audio-visual speech separation and dereverberation with a two-stage multimodal network

K Tan, Y Xu, SX Zhang, M Yu… - IEEE Journal of Selected …, 2020 - ieeexplore.ieee.org
Background noise, interfering speech and room reverberation frequently distort target
speech in real listening environments. In this study, we address joint speech separation and …

Bridging the gap between monaural speech enhancement and recognition with distortion-independent acoustic modeling

P Wang, K Tan - IEEE/ACM Transactions on Audio, Speech …, 2019 - ieeexplore.ieee.org
Monaural speech enhancement has made dramatic advances since the introduction of deep
learning a few years ago. Although enhanced speech has been demonstrated to have better …

Unsupervised training of a deep clustering model for multichannel blind source separation

L Drude, D Hasenklever… - ICASSP 2019-2019 …, 2019 - ieeexplore.ieee.org
We propose a training scheme to train neural network-based source separation algorithms
from scratch when parallel clean data is unavailable. In particular, we demonstrate that an …
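
Deep clustering maps each time-frequency bin to an embedding vector and separates sources by clustering those embeddings. A minimal numpy sketch of the clustering-to-mask step is below, with plain k-means standing in for the clustering; all names are assumptions, and this is not the paper's training scheme:

```python
import numpy as np

def embeddings_to_masks(V, n_sources=2, n_iter=20):
    """Cluster T-F embedding vectors into per-source binary masks.

    V: (N, E) array of embeddings, one per time-frequency bin (N = T*F).
    Returns masks of shape (n_sources, N) with one-hot columns.
    """
    # Farthest-point initialization keeps this sketch deterministic.
    centers = V[[0]]
    for _ in range(1, n_sources):
        d = ((V[:, None, :] - centers[None]) ** 2).sum(-1).min(1)
        centers = np.vstack([centers, V[d.argmax()]])
    for _ in range(n_iter):
        # Assign each bin to its nearest centroid, then update centroids.
        d = ((V[:, None, :] - centers[None]) ** 2).sum(-1)  # (N, K)
        assign = d.argmin(1)
        centers = np.vstack([V[assign == k].mean(0) if (assign == k).any()
                             else centers[k] for k in range(n_sources)])
    masks = np.zeros((n_sources, len(V)))
    masks[assign, np.arange(len(V))] = 1.0
    return masks
```

The resulting binary masks are applied to the mixture spectrogram, one per source; in multichannel settings the cluster assignments can instead be supplied or refined by a probabilistic spatial model, which is what enables training without parallel clean data.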

Dual application of speech enhancement for automatic speech recognition

A Pandey, C Liu, Y Wang… - 2021 IEEE Spoken …, 2021 - ieeexplore.ieee.org
In this work, we exploit speech enhancement to improve a recurrent neural network
transducer (RNN-T) based ASR system. We employ a dense convolutional recurrent …

Integration of neural networks and probabilistic spatial models for acoustic blind source separation

L Drude, R Haeb-Umbach - IEEE Journal of Selected Topics in …, 2019 - ieeexplore.ieee.org
We formulate a generic framework for blind source separation (BSS), which allows
integrating data-driven spectro-temporal methods, such as deep clustering and deep …