A survey of speaker recognition: Fundamental theories, recognition methods and opportunities

MM Kabir, MF Mridha, J Shin, I Jahan, AQ Ohi - IEEE Access, 2021 - ieeexplore.ieee.org
Humans can identify a speaker by listening to their voice, over the telephone, or on any
digital devices. Acquiring this congenital human competency, authentication technologies …

The third DIHARD diarization challenge

N Ryant, P Singh, V Krishnamohan, R Varma… - arXiv preprint arXiv …, 2020 - arxiv.org
DIHARD III was the third in a series of speaker diarization challenges intended to improve
the robustness of diarization systems to variability in recording equipment, noise conditions …

Spot the conversation: speaker diarisation in the wild

JS Chung, J Huh, A Nagrani, T Afouras… - arXiv preprint arXiv …, 2020 - arxiv.org
The goal of this paper is speaker diarisation of videos collected 'in the wild'. We make three
key contributions. First, we propose an automatic audio-visual diarisation method for …

The second DIHARD diarization challenge: Dataset, task, and baselines

N Ryant, K Church, C Cieri, A Cristia, J Du… - arXiv preprint arXiv …, 2019 - arxiv.org
This paper introduces the second DIHARD challenge, the second in a series of speaker
diarization challenges intended to improve the robustness of diarization systems to variation …

Overlapping speech detection using long-term conversational features for speaker diarization in meeting room conversations

SH Yella, H Bourlard - IEEE/ACM Transactions on Audio …, 2014 - ieeexplore.ieee.org
Overlapping speech has been identified as one of the main sources of errors in diarization of
meeting room conversations. Therefore, overlap detection has become an important step …

Teager–Kaiser energy operators for overlapped speech detection

N Shokouhi, JHL Hansen - IEEE/ACM Transactions on Audio …, 2017 - ieeexplore.ieee.org
Overlapped speech is referred to a monophonic audio signal in which at least two speakers
are present at the same time. In this study, the focus is on distinguishing overlapped from …

Detecting overlapping speech with long short-term memory recurrent neural networks

JT Geiger, F Eyben, B Schuller… - … Interspeech 2013, 14th …, 2013 - mediatum.ub.tum.de
Detecting segments of overlapping speech (when two or more speakers are active at the
same time) is a challenging problem. Previously, mostly HMM-based systems have been …

Enhancing LSTM RNN-based speech overlap detection by artificially mixed data

G Hagerer, V Pandit, F Eyben, B Schuller - Audio Engineering Society …, 2017 - aes.org
This paper presents a new method for Long Short-Term Memory Recurrent Neural Network
(LSTM) based speech overlap detection. To this end, speech overlap data is created …

From Modular to End-to-End Speaker Diarization

F Landini - arXiv preprint arXiv:2407.08752, 2024 - arxiv.org
Speaker diarization is usually referred to as the task that determines "who spoke when" in a
recording. Until a few years ago, all competitive approaches were modular. Systems based …

Overlapping speaker segmentation using multiple hypothesis tracking of fundamental frequency

AOT Hogg, C Evers, AH Moore… - IEEE/ACM Transactions …, 2021 - ieeexplore.ieee.org
This paper demonstrates how the harmonic structure of voiced speech can be exploited to
segment multiple overlapping speakers in a speaker diarization task. We explore how a …