Past review, current progress, and challenges ahead on the cocktail party problem

K Žmolíková, M Delcroix, K Kinoshita… - IEEE Journal of …, 2019 - ieeexplore.ieee.org

The processing of speech corrupted by interfering overlapping speakers is one of the
challenging problems with regards to today's automatic speech recognition systems …

Cited by 230 · Related articles · All 5 versions

A review on speech separation in cocktail party environment: challenges and approaches

J Agrawal, M Gupta, H Garg - Multimedia Tools and Applications, 2023 - Springer

The cocktail party problem, which is tracing and identifying a specific speaker's speech
while numerous speakers communicate concurrently, is one of the crucial problems still to be …

Cited by 16 · Related articles · All 3 versions

Combining spectral and spatial features for deep learning based blind speaker separation

ZQ Wang, DL Wang - … ACM Transactions on audio, speech, and …, 2018 - ieeexplore.ieee.org

This study tightly integrates complementary spectral and spatial features for deep learning
based multi-channel speaker separation in reverberant environments. The key idea is to …

Cited by 151 · Related articles · All 4 versions

Deep learning based phase reconstruction for speaker separation: A trigonometric perspective

ZQ Wang, K Tan, DL Wang - ICASSP 2019-2019 IEEE …, 2019 - ieeexplore.ieee.org

This study investigates phase reconstruction for deep learning based monaural talker-
independent speaker separation in the short-time Fourier transform (STFT) domain. The key …

Cited by 107 · Related articles · All 10 versions

Deep extractor network for target speaker recovery from single channel speech mixtures

J Wang, J Chen, D Su, L Chen, M Yu, Y Qian… - arXiv preprint arXiv …, 2018 - arxiv.org

Speaker-aware source separation methods are promising workarounds for major difficulties
such as arbitrary source permutation and unknown number of sources. However, it remains …

Cited by 104 · Related articles · All 7 versions

A survey of unsupervised learning methods for high-dimensional uncertainty quantification in black-box-type problems

K Kontolati, D Loukrezis, DG Giovanis… - Journal of …, 2022 - Elsevier

Constructing surrogate models for uncertainty quantification (UQ) on complex partial
differential equations (PDEs) having inherently high-dimensional O(10^n), n ≥ 2, stochastic …

Cited by 44 · Related articles · All 6 versions

End-to-end monaural multi-speaker ASR system without pretraining

X Chang, Y Qian, K Yu… - ICASSP 2019-2019 IEEE …, 2019 - ieeexplore.ieee.org

Recently, end-to-end models have become a popular approach as an alternative to
traditional hybrid models in automatic speech recognition (ASR). The multi-speaker speech …

Cited by 88 · Related articles · All 5 versions

Audio-visual end-to-end multi-channel speech separation, dereverberation and recognition

G Li, J Deng, M Geng, Z Jin, T Wang… - … on Audio, Speech …, 2023 - ieeexplore.ieee.org

Accurate recognition of cocktail party speech containing overlapping speakers, noise and
reverberation remains a highly challenging task to date. Motivated by the invariance of …

Cited by 12 · Related articles · All 6 versions

Single-channel multi-talker speech recognition with permutation invariant training

Y Qian, X Chang, D Yu - Speech Communication, 2018 - Elsevier

Although great progress has been made in automatic speech recognition (ASR), significant
performance degradation is still observed when recognizing multi-talker mixed speech. In …

Cited by 92 · Related articles · All 4 versions

[PDF] Challenges and feasibility of automatic speech recognition for modeling student collaborative discourse in classrooms

R Southwell, S Pugh, M Perkoff, C Clevenger… - … Data Mining Society, 2022 - par.nsf.gov

Automatic speech recognition (ASR) has considerable potential to model aspects of
classroom discourse with the goals of automated assessment, feedback, and instructional …

Cited by 26 · Related articles · All 5 versions