Far-field automatic speech recognition

R Haeb-Umbach, J Heymann, L Drude… - Proceedings of the …, 2020 - ieeexplore.ieee.org
The machine recognition of speech spoken at a distance from the microphones, known as
far-field automatic speech recognition (ASR), has received a significant increase in attention …

Speech processing for digital home assistants: Combining signal processing with deep-learning techniques

R Haeb-Umbach, S Watanabe… - IEEE Signal …, 2019 - ieeexplore.ieee.org
Once a popular theme of futuristic science fiction or far-fetched technology forecasts, digital
home assistants with a spoken language interface have become a ubiquitous commodity …

Front-end processing for the CHiME-5 dinner party scenario

C Boeddeker, J Heitkaemper… - CHiME5 Workshop …, 2018 - isca-archive.org
This contribution presents a speech enhancement system for the CHiME-5 Dinner Party
Scenario. The front-end employs multi-channel linear time-variant filtering and achieves its …

A unified convolutional beamformer for simultaneous denoising and dereverberation

T Nakatani, K Kinoshita - IEEE Signal Processing Letters, 2019 - ieeexplore.ieee.org
This letter proposes a method for estimating a convolutional beamformer that can perform
denoising and dereverberation simultaneously in an optimal way. The application of …

Audio-visual end-to-end multi-channel speech separation, dereverberation and recognition

G Li, J Deng, M Geng, Z Jin, T Wang… - … on Audio, Speech …, 2023 - ieeexplore.ieee.org
Accurate recognition of cocktail party speech containing overlapping speakers, noise and
reverberation remains a highly challenging task to date. Motivated by the invariance of …

End-to-end dereverberation, beamforming, and speech recognition in a cocktail party

W Zhang, X Chang, C Boeddeker… - … on Audio, Speech …, 2022 - ieeexplore.ieee.org
Far-field multi-speaker automatic speech recognition (ASR) has drawn increasing attention
in recent years. Most existing methods feature a signal processing frontend and an ASR …

Meta-AF: Meta-learning for adaptive filters

J Casebeer, NJ Bryan… - IEEE/ACM Transactions …, 2022 - ieeexplore.ieee.org
Adaptive filtering algorithms are pervasive throughout signal processing and have had a
material impact on a wide variety of domains including audio processing …

Jointly optimal dereverberation and beamforming

C Boeddeker, T Nakatani, K Kinoshita… - ICASSP 2020-2020 …, 2020 - ieeexplore.ieee.org
We previously proposed an optimal (in the maximum likelihood sense) convolutional
beamformer that can perform simultaneous denoising and dereverberation, and showed its …

A time-domain real-valued generalized Wiener filter for multi-channel neural separation systems

Y Luo - IEEE/ACM Transactions on Audio, Speech, and …, 2022 - ieeexplore.ieee.org
Frequency-domain beamformers have been successful in a wide range of multi-channel
neural separation systems in the past years. However, the operations in conventional …

Leveraging low-distortion target estimates for improved speech enhancement

ZQ Wang, G Wichern, JL Roux - arXiv preprint arXiv:2110.00570, 2021 - arxiv.org
A promising approach for multi-microphone speech separation involves two deep neural
networks (DNN), where the predicted target speech from the first DNN is used to compute …