A review of speaker diarization: Recent advances with deep learning

TJ Park, N Kanda, D Dimitriadis, KJ Han… - Computer Speech & …, 2022 - Elsevier
Speaker diarization is a task to label audio or video recordings with classes that correspond
to speaker identity, or in short, a task to identify “who spoke when”. In the early years …

Convolutional neural network for speaker change detection in telephone speaker diarization system

M Hrúz, Z Zajíc - … Conference on Acoustics, Speech and Signal …, 2017 - ieeexplore.ieee.org
The aim of this paper is to propose a speaker change detection technique based on
Convolutional Neural Network (CNN) and evaluate its contribution to the performance of a …

Multitask detection of speaker changes, overlapping speech and voice activity using wav2vec 2.0

M Kunešová, Z Zajíc - ICASSP 2023-2023 IEEE International …, 2023 - ieeexplore.ieee.org
Self-supervised learning approaches have lately achieved great success on a broad
spectrum of machine learning problems. In the field of speech processing, one of the most …

Speaker Diarization Using Convolutional Neural Network for Statistics Accumulation Refinement

Z Zajíc, M Hrúz, L Müller - INTERSPEECH, 2017 - kky-sw.zcu.cz
The aim of this paper is to investigate the benefit of information from a speaker change
detection system based on Convolutional Neural Network (CNN) when applied to the …

Linguistically aided speaker diarization using speaker role information

N Flemotomos, P Georgiou, S Narayanan - arXiv preprint arXiv …, 2019 - arxiv.org
Speaker diarization relies on the assumption that speech segments corresponding to a
particular speaker are concentrated in a specific region of the speaker space; a region which …

ZCU-NTIS speaker diarization system for the DIHARD 2018 challenge

Z Zajíc, M Kunešová, J Zelinka, M Hrúz - Proc. Interspeech, 2018 - isca-archive.org
In this paper, we present the system developed by the team from the New Technologies for
the Information Society (NTIS) research center of the University of West Bohemia, for the …

A memory augmented architecture for continuous speaker identification in meetings

N Flemotomos, D Dimitriadis - ICASSP 2020-2020 IEEE …, 2020 - ieeexplore.ieee.org
We introduce and analyze a novel approach to the problem of speaker identification in multi-
party recorded meetings. Given a speech segment and a set of available candidate profiles …

Multitask Speech Recognition and Speaker Change Detection for Unknown Number of Speakers

S Kumar, S Madikeri, I Nigmatulina… - ICASSP 2024-2024 …, 2024 - ieeexplore.ieee.org
Traditionally, automatic speech recognition (ASR) and speaker change detection (SCD)
systems have been independently trained to generate comprehensive transcripts …

Software suite for automating the modeling of segmentation of speech signals and vocal performances

АЮ Якимук, АА Конев, АО Осипов - iPolytech Journal, 2017 - cyberleninka.ru
PURPOSE. This paper considers the problem of automating the modeling of
segmentation of speech signals and vocal performances. METHODS. The specifics …

UWB-NTIS speaker diarization system for the DIHARD II 2019 challenge

Z Zajíc, M Kunešová, M Hrúz, J Vaněk - arXiv preprint arXiv:1905.11276, 2019 - arxiv.org
In this paper, we present our system developed by the team from the New Technologies for
the Information Society (NTIS) research center of the University of West Bohemia in Pilsen …