Similarity measurement of segment-level speaker embeddings in speaker diarization

A Mehrish, N Majumder, R Bharadwaj, R Mihalcea… - Information …, 2023 - Elsevier

The field of speech processing has undergone a transformative shift with the advent of deep
learning. The use of multiple processing layers has enabled the creation of models capable …

Cited by 189 Related articles All 6 versions

[HTML] sciencedirect.com

[HTML] An experimental review of speaker diarization methods with application to two-speaker conversational telephone speech recordings

L Serafini, S Cornell, G Morrone, E Zovato… - Computer Speech & …, 2023 - Elsevier

We performed an experimental review of current diarization systems for the conversational
telephone speech (CTS) domain. In detail, we considered a total of eight different algorithms …

Cited by 8 Related articles All 6 versions

[PDF] arxiv.org

Target-speaker voice activity detection via sequence-to-sequence prediction

M Cheng, W Wang, Y Zhang, X Qin… - ICASSP 2023-2023 …, 2023 - ieeexplore.ieee.org

Target-speaker voice activity detection is currently a promising approach for speaker
diarization in complex acoustic environments. This paper presents a novel Sequence-to …

Cited by 34 Related articles All 4 versions

[PDF] arxiv.org

Supervised hierarchical clustering using graph neural networks for speaker diarization

P Singh, A Kaul, S Ganapathy - ICASSP 2023-2023 IEEE …, 2023 - ieeexplore.ieee.org

Conventional methods for speaker diarization involve windowing an audio file into short
segments to extract speaker embeddings, followed by an unsupervised clustering of the …

Cited by 10 Related articles All 4 versions

[PDF] arxiv.org

Multi-input multi-output target-speaker voice activity detection for unified, flexible, and robust audio-visual speaker diarization

M Cheng, M Li - arXiv preprint arXiv:2401.08052, 2024 - arxiv.org

Audio-visual learning has demonstrated promising results in many classical speech tasks
(e.g., speech separation, automatic speech recognition, wake-word spotting). We believe that …

Cited by 6 Related articles All 2 versions

[PDF] arxiv.org

End-to-end Online Speaker Diarization with Target Speaker Tracking

W Wang, M Li - arXiv preprint arXiv:2310.08696, 2023 - arxiv.org

This paper proposes an online target speaker voice activity detection system for speaker
diarization tasks, which does not require a priori knowledge from the clustering-based …

Cited by 5 Related articles All 2 versions

[PDF] ox.ac.uk

[PDF] The DKU-SMIIP diarization system for the VoxCeleb Speaker Recognition Challenge 2022

W Wang, X Qin, M Cheng, Y Zhang, K Wang… - Voxsrc Workshop, 2022 - robots.ox.ac.uk

This paper describes the DKU-SMIIP submission to the 4th track of the VoxCeleb Speaker
Recognition Challenge 2022 (VoxSRC-22). Our system contains a fused voice activity …

Cited by 9 Related articles All 2 versions

[PDF] arxiv.org

The DKU-DukeECE diarization system for the VoxCeleb Speaker Recognition Challenge 2022

W Wang, X Qin, M Cheng, Y Zhang, K Wang… - arXiv preprint arXiv …, 2022 - arxiv.org

This paper describes the DKU-DukeECE submission to the 4th track of the VoxCeleb
Speaker Recognition Challenge 2022 (VoxSRC-22). Our system contains a fused voice …

Cited by 9 Related articles All 2 versions

[PDF] arxiv.org

The DKU-MSXF diarization system for the VoxCeleb Speaker Recognition Challenge 2023

M Cheng, W Wang, X Qin, Y Lin, N Jiang… - National Conference on …, 2023 - Springer

This paper describes the DKU-MSXF submission to track 4 of the VoxCeleb Speaker
Recognition Challenge 2023 (VoxSRC-23). Our system pipeline contains voice activity …

Cited by 7 Related articles All 7 versions

[PDF] arxiv.org

Multi-target extractor and detector for unknown-number speaker diarization

CY Cheng, HS Lee, Y Tsao… - IEEE Signal Processing …, 2023 - ieeexplore.ieee.org

Strong representations of target speakers can help extract important information about
speakers and detect corresponding temporal regions in multi-speaker conversations. In this …

Cited by 11 Related articles All 3 versions