Light gated recurrent units for speech recognition
M Ravanelli, P Brakel, M Omologo… - IEEE Transactions on …, 2018 - ieeexplore.ieee.org
A field that has directly benefited from the recent advances in deep learning is automatic
speech recognition (ASR). Despite the great achievements of the past decades, however, a …
Continuous speech separation: Dataset and analysis
This paper describes a dataset and protocols for evaluating continuous speech separation
algorithms. Most prior speech separation studies use pre-segmented audio signals, which …
Far-field automatic speech recognition
The machine recognition of speech spoken at a distance from the microphones, known as
far-field automatic speech recognition (ASR), has received a significant increase in attention …
Generalization of multi-channel linear prediction methods for blind MIMO impulse response shortening
T Yoshioka, T Nakatani - IEEE Transactions on Audio, Speech …, 2012 - ieeexplore.ieee.org
The performance of many microphone array processing techniques deteriorates in the
presence of reverberation. To provide a widely applicable solution to this longstanding …
Making machines understand us in reverberant rooms: Robustness against reverberation for automatic speech recognition
Speech recognition technology has left the research laboratory and is increasingly coming
into practical use, enabling a wide spectrum of innovative and exciting voice-driven …
Audio user interaction recognition and application interface
Disclosed is an application interface that takes into account the user's gaze direction relative
to who is speaking in an interactive multi-participant environment where audio-based …
Advances in online audio-visual meeting transcription
T Yoshioka, I Abramovski, C Aksoylar… - 2019 IEEE Automatic …, 2019 - ieeexplore.ieee.org
This paper describes a system that generates speaker-annotated transcripts of meetings by
using a microphone array and a 360-degree camera. The hallmark of the system is its ability …
Method and apparatus for detecting speech endpoint using weighted finite state transducer
H Chung, S Lee, YK Lee - US Patent 9,396,722, 2016 - Google Patents
Disclosed are an apparatus and a method for detecting a speech endpoint using a WFST.
The apparatus in accordance with an embodiment of the present invention includes: a …
Online MVDR beamformer based on complex Gaussian mixture model with spatial prior for noise robust ASR
This paper considers acoustic beamforming for noise robust automatic speech recognition.
A beamformer attenuates background noise by enhancing sound components coming from …
Recognizing overlapped speech in meetings: A multichannel separation approach using neural networks
The goal of this work is to develop a meeting transcription system that can recognize speech
even when utterances of different speakers are overlapped. While speech overlaps have …