Speakerbeam: Speaker aware neural network for target speaker extraction in speech mixtures
The processing of speech corrupted by interfering overlapping speakers is one of the
challenging problems with regards to today's automatic speech recognition systems …
A review on speech separation in cocktail party environment: challenges and approaches
The Cocktail party problem, which is tracing and identifying a specific speaker's speech
while numerous speakers communicate concurrently is one of the crucial problems still to be …
Combining spectral and spatial features for deep learning based blind speaker separation
This study tightly integrates complementary spectral and spatial features for deep learning
based multi-channel speaker separation in reverberant environments. The key idea is to …
Deep learning based phase reconstruction for speaker separation: A trigonometric perspective
This study investigates phase reconstruction for deep learning based monaural talker-
independent speaker separation in the short-time Fourier transform (STFT) domain. The key …
Deep extractor network for target speaker recovery from single channel speech mixtures
Speaker-aware source separation methods are promising workarounds for major difficulties
such as arbitrary source permutation and unknown number of sources. However, it remains …
A survey of unsupervised learning methods for high-dimensional uncertainty quantification in black-box-type problems
Constructing surrogate models for uncertainty quantification (UQ) on complex partial
differential equations (PDEs) having inherently high-dimensional O(10^n), n ≥ 2, stochastic …
End-to-end monaural multi-speaker ASR system without pretraining
Recently, end-to-end models have become a popular approach as an alternative to
traditional hybrid models in automatic speech recognition (ASR). The multi-speaker speech …
Audio-visual end-to-end multi-channel speech separation, dereverberation and recognition
Accurate recognition of cocktail party speech containing overlapping speakers, noise and
reverberation remains a highly challenging task to date. Motivated by the invariance of …
Single-channel multi-talker speech recognition with permutation invariant training
Although great progress has been made in automatic speech recognition (ASR), significant
performance degradation is still observed when recognizing multi-talker mixed speech. In …
Challenges and feasibility of automatic speech recognition for modeling student collaborative discourse in classrooms
Automatic speech recognition (ASR) has considerable potential to model aspects of
classroom discourse with the goals of automated assessment, feedback, and instructional …