Far-field automatic speech recognition
The machine recognition of speech spoken at a distance from the microphones, known as
far-field automatic speech recognition (ASR), has received a significant increase in attention …
Speech processing for digital home assistants: Combining signal processing with deep-learning techniques
R Haeb-Umbach, S Watanabe… - IEEE Signal …, 2019 - ieeexplore.ieee.org
Once a popular theme of futuristic science fiction or far-fetched technology forecasts, digital
home assistants with a spoken language interface have become a ubiquitous commodity …
Front-end processing for the CHiME-5 dinner party scenario
C Boeddeker, J Heitkaemper… - CHiME5 Workshop …, 2018 - isca-archive.org
This contribution presents a speech enhancement system for the CHiME-5 Dinner Party
Scenario. The front-end employs multi-channel linear time-variant filtering and achieves its …
A unified convolutional beamformer for simultaneous denoising and dereverberation
T Nakatani, K Kinoshita - IEEE Signal Processing Letters, 2019 - ieeexplore.ieee.org
This letter proposes a method for estimating a convolutional beamformer that can perform
denoising and dereverberation simultaneously in an optimal way. The application of …
Audio-visual end-to-end multi-channel speech separation, dereverberation and recognition
Accurate recognition of cocktail party speech containing overlapping speakers, noise and
reverberation remains a highly challenging task to date. Motivated by the invariance of …
End-to-end dereverberation, beamforming, and speech recognition in a cocktail party
Far-field multi-speaker automatic speech recognition (ASR) has drawn increasing attention
in recent years. Most existing methods feature a signal processing frontend and an ASR …
Meta-AF: Meta-learning for adaptive filters
J Casebeer, NJ Bryan… - IEEE/ACM Transactions …, 2022 - ieeexplore.ieee.org
Adaptive filtering algorithms are pervasive throughout signal processing and have had a
material impact on a wide variety of domains including audio processing …
Jointly optimal dereverberation and beamforming
We previously proposed an optimal (in the maximum likelihood sense) convolutional
beamformer that can perform simultaneous denoising and dereverberation, and showed its …
A time-domain real-valued generalized Wiener filter for multi-channel neural separation systems
Y Luo - IEEE/ACM Transactions on Audio, Speech, and …, 2022 - ieeexplore.ieee.org
Frequency-domain beamformers have been successful in a wide range of multi-channel
neural separation systems in the past years. However, the operations in conventional …
Leveraging low-distortion target estimates for improved speech enhancement
A promising approach for multi-microphone speech separation involves two deep neural
networks (DNN), where the predicted target speech from the first DNN is used to compute …