Neural target speech extraction: An overview
K Zmolikova, M Delcroix, T Ochiai… - IEEE Signal …, 2023 - ieeexplore.ieee.org
Humans can listen to a target speaker even in challenging acoustic conditions that have
noise, reverberation, and interfering speakers. This phenomenon is known as the cocktail …
Speakerbeam: Speaker aware neural network for target speaker extraction in speech mixtures
The processing of speech corrupted by interfering overlapping speakers is one of the
challenging problems with regards to today's automatic speech recognition systems …
Single channel target speaker extraction and recognition with speaker beam
This paper addresses the problem of single channel speech recognition of a target speaker
in a mixture of speech signals. We propose to exploit auxiliary speaker information provided …
Lessons from building acoustic models with a million hours of speech
SHK Parthasarathi, N Strom - ICASSP 2019-2019 IEEE …, 2019 - ieeexplore.ieee.org
This is a report of our lessons learned building acoustic models from 1 Million hours of
unlabeled speech, while labeled speech is restricted to 7,000 hours. We employ …
Deep extractor network for target speaker recovery from single channel speech mixtures
Speaker-aware source separation methods are promising workarounds for major difficulties
such as arbitrary source permutation and unknown number of sources. However, it remains …
Bioinspired dual-channel speech recognition using graphene-based electromyographic and mechanical sensors
Automatic speech recognition (ASR) is helpful to improve quality of life. However, the
performance of ASR degrades in the case of noisy environment, limited privacy, and speech …
Improving noise robustness of automatic speech recognition via parallel data and teacher-student learning
For real-world speech recognition applications, noise robustness is still a challenge. In this
work, we adopt the teacher-student (T/S) learning technique using a parallel clean and noisy …
Developing far-field speaker system via teacher-student learning
In this study, we develop the keyword spotting (KWS) and acoustic model (AM) components
in a far-field speaker system. Specifically, we use teacher-student (T/S) learning to adapt a …
Frequency domain multi-channel acoustic modeling for distant speech recognition
Conventional far-field automatic speech recognition (ASR) systems typically employ
microphone array techniques for speech enhancement in order to improve robustness …
Slot-triggered contextual biasing for personalized speech recognition using neural transducers
End-to-end (E2E) automatic speech recognition (ASR) models have been found to perform
well on general transcription tasks but often fail to correctly recognize words that occur …