Single-channel multi-talker speech recognition with permutation invariant training

Y Luo, N Mesgarani - IEEE/ACM transactions on audio, speech …, 2019 - ieeexplore.ieee.org

Single-channel, speaker-independent speech separation methods have recently seen great
progress. However, the accuracy, latency, and computational cost of such methods remain …

Cited by 2153 Related articles All 13 versions

[HTML] zju.edu.cn

Past review, current progress, and challenges ahead on the cocktail party problem

Y Qian, C Weng, X Chang, S Wang, D Yu - Frontiers of Information …, 2018 - Springer

The cocktail party problem, i.e., tracing and recognizing the speech of a specific speaker
when multiple speakers talk simultaneously, is one of the critical problems yet to be solved …

Cited by 102 Related articles All 6 versions

[PDF] vut.cz

Single channel target speaker extraction and recognition with speaker beam

M Delcroix, K Zmolikova, K Kinoshita… - … on acoustics, speech …, 2018 - ieeexplore.ieee.org

This paper addresses the problem of single channel speech recognition of a target speaker
in a mixture of speech signals. We propose to exploit auxiliary speaker information provided …

Cited by 219 Related articles All 5 versions

[PDF] arxiv.org

Deep extractor network for target speaker recovery from single channel speech mixtures

J Wang, J Chen, D Su, L Chen, M Yu, Y Qian… - arXiv preprint arXiv …, 2018 - arxiv.org

Speaker-aware source separation methods are promising workarounds for major difficulties
such as arbitrary source permutation and unknown number of sources. However, it remains …

Cited by 104 Related articles All 7 versions

[PDF] arxiv.org

End-to-end monaural multi-speaker ASR system without pretraining

X Chang, Y Qian, K Yu… - ICASSP 2019-2019 IEEE …, 2019 - ieeexplore.ieee.org

Recently, end-to-end models have become a popular approach as an alternative to
traditional hybrid models in automatic speech recognition (ASR). The multi-speaker speech …

Cited by 88 Related articles All 5 versions

[PDF] arxiv.org

MIMO-Speech: End-to-end multi-channel multi-speaker speech recognition

X Chang, W Zhang, Y Qian, J Le Roux… - 2019 IEEE Automatic …, 2019 - ieeexplore.ieee.org

Recently, the end-to-end approach has proven its efficacy in monaural multi-speaker speech
recognition. However, high word error rates (WERs) still prevent these systems from being …

Cited by 125 Related articles All 10 versions

[PDF] uni-paderborn.de

End-to-end dereverberation, beamforming, and speech recognition in a cocktail party

W Zhang, X Chang, C Boeddeker… - … on Audio, Speech …, 2022 - ieeexplore.ieee.org

Far-field multi-speaker automatic speech recognition (ASR) has drawn increasing attention
in recent years. Most existing methods feature a signal processing frontend and an ASR …

Cited by 18 Related articles All 5 versions

[PDF] arxiv.org

Progressive joint modeling in unsupervised single-channel overlapped speech recognition

Z Chen, J Droppo, J Li, W Xiong - IEEE/ACM Transactions on …, 2017 - ieeexplore.ieee.org

Unsupervised single-channel overlapped speech recognition is one of the hardest problems
in automatic speech recognition (ASR). Permutation invariant training (PIT) is a state of the …

Cited by 87 Related articles All 4 versions

[PDF] arxiv.org

Learning to enhance or not: Neural network-based switching of enhanced and observed signals for overlapping speech recognition

H Sato, T Ochiai, M Delcroix, K Kinoshita… - ICASSP 2022-2022 …, 2022 - ieeexplore.ieee.org

The combination of a deep neural network (DNN)-based speech enhancement (SE) front-end
and an automatic speech recognition (ASR) back-end is a widely used approach to …

Cited by 27 Related articles All 3 versions

[PDF] arxiv.org

Multi-talker ASR for an unknown number of sources: Joint training of source counting, separation and ASR

T von Neumann, C Boeddeker, L Drude… - arXiv preprint arXiv …, 2020 - arxiv.org

Most approaches to multi-talker overlapped speech separation and recognition assume that
the number of simultaneously active speakers is given, but in realistic situations, it is typically …

Cited by 49 Related articles All 10 versions