A review of deep learning techniques for speech processing

A Mehrish, N Majumder, R Bharadwaj, R Mihalcea… - Information …, 2023 - Elsevier
The field of speech processing has undergone a transformative shift with the advent of deep
learning. The use of multiple processing layers has enabled the creation of models capable …

Speaker recognition based on deep learning: An overview

Z Bai, XL Zhang - Neural Networks, 2021 - Elsevier
Speaker recognition is a task of identifying persons from their voices. Recently, deep
learning has dramatically revolutionized speaker recognition. However, there is lack of …

Speaker recognition for multi-speaker conversations using x-vectors

D Snyder, D Garcia-Romero, G Sell… - ICASSP 2019-2019 …, 2019 - ieeexplore.ieee.org
Recently, deep neural networks that map utterances to fixed-dimensional embeddings have
emerged as the state-of-the-art in speaker recognition. Our prior work introduced x-vectors …

The third DIHARD diarization challenge

N Ryant, P Singh, V Krishnamohan, R Varma… - arXiv preprint arXiv …, 2020 - arxiv.org
DIHARD III was the third in a series of speaker diarization challenges intended to improve
the robustness of diarization systems to variability in recording equipment, noise conditions …

Target-speaker voice activity detection: a novel approach for multi-speaker diarization in a dinner party scenario

I Medennikov, M Korenevsky, T Prisyach… - arXiv preprint arXiv …, 2020 - arxiv.org
Speaker diarization for real-life scenarios is an extremely challenging problem. Widely used
clustering-based diarization approaches perform rather poorly in such conditions, mainly …

The second DIHARD diarization challenge: Dataset, task, and baselines

N Ryant, K Church, C Cieri, A Cristia, J Du… - arXiv preprint arXiv …, 2019 - arxiv.org
This paper introduces the second DIHARD challenge, the second in a series of speaker
diarization challenges intended to improve the robustness of diarization systems to variation …

The SpeakIn system for VoxCeleb speaker recognition challange 2021

M Zhao, Y Ma, M Liu, M Xu - arXiv preprint arXiv:2109.01989, 2021 - arxiv.org
This report describes our submission to the track 1 and track 2 of the VoxCeleb Speaker
Recognition Challenge 2021 (VoxSRC 2021). Both track 1 and track 2 share the same …

VoxSRC 2020: The second VoxCeleb speaker recognition challenge

A Nagrani, JS Chung, J Huh, A Brown, E Coto… - arXiv preprint arXiv …, 2020 - arxiv.org
We held the second installment of the VoxCeleb Speaker Recognition Challenge in
conjunction with Interspeech 2020. The goal of this challenge was to assess how well …

VoxSRC 2021: The third VoxCeleb speaker recognition challenge

A Brown, J Huh, JS Chung, A Nagrani… - arXiv preprint arXiv …, 2022 - arxiv.org
The third instalment of the VoxCeleb Speaker Recognition Challenge was held in
conjunction with Interspeech 2021. The aim of this challenge was to assess how well current …

Overlap-aware diarization: Resegmentation using neural end-to-end overlapped speech detection

L Bullock, H Bredin… - ICASSP 2020-2020 IEEE …, 2020 - ieeexplore.ieee.org
We address the problem of effectively handling overlapping speech in a diarization system.
First, we detail a neural Long Short-Term Memory-based architecture for overlap detection …