Joint speaker diarization and speech recognition based on region proposal networks

H Liu, B MacWhinney, D Fromm, A Lanzi - Journal of Speech, Language, and …, 2023 - ASHA

Purpose: A major barrier to the wider use of language sample analysis (LSA) is the fact that
transcription is very time intensive. Methods that can reduce the required time and effort …

Cited by 28 · Related articles · All 13 versions

Attention-Based Encoder-Decoder End-to-End Neural Diarization With Embedding Enhancer

Z Chen, B Han, S Wang, Y Qian - IEEE/ACM Transactions on …, 2024 - ieeexplore.ieee.org

Deep neural network-based systems have significantly improved the performance of
speaker diarization tasks. However, end-to-end neural diarization (EEND) systems often …

Cited by 17 · Related articles · All 5 versions

Meeting recognition with continuous speech separation and transcription-supported diarization

T Von Neumann, C Boeddeker… - … , Speech, and Signal …, 2024 - ieeexplore.ieee.org

We propose a modular pipeline for the single-channel separation, recognition, and
diarization of meeting-style recordings and evaluate it on the Libri-CSS dataset. Using a …

Cited by 7 · Related articles · All 2 versions

Unified modeling of multi-talker overlapped speech recognition and diarization with a sidecar separator

L Meng, J Kang, M Cui, H Wu, X Wu… - arXiv preprint arXiv …, 2023 - arxiv.org

Multi-talker overlapped speech poses a significant challenge for speech recognition and
diarization. Recent research indicated that these two tasks are inter-dependent and …

Cited by 11 · Related articles · All 6 versions

Optimized speaker change detection approach for speaker segmentation towards speaker diarization based on deep learning

K VijayKumar - Data & Knowledge Engineering, 2023 - Elsevier

Speaker diarization is the partitioning of an audio source stream into homogeneous
segments according to the speaker's identity. It can improve the readability of an automatic …

Cited by 5 · Related articles

Sec-gan for robust speaker recognition with emotional state dismatch

D Li, Z Yang, Z Wang, M Hua - Biomedical Signal Processing and Control, 2023 - Elsevier

Speaker recognition is often dependent on the speaker and susceptible to emotional factors,
thus mostly decreasing recognition performance, we propose a framework by combining …

Cited by 1 · Related articles