Speaker recognition based on deep learning: An overview

Z Bai, XL Zhang - Neural Networks, 2021 - Elsevier
Speaker recognition is a task of identifying persons from their voices. Recently, deep
learning has dramatically revolutionized speaker recognition. However, there is a lack of …

STC antispoofing systems for the ASVspoof2019 challenge

G Lavrentyeva, S Novoselov, A Tseren… - arXiv preprint arXiv …, 2019 - arxiv.org
This paper describes the Speech Technology Center (STC) antispoofing systems submitted
to the ASVspoof 2019 challenge. The ASVspoof2019 is the extended version of the previous …

Self multi-head attention for speaker recognition

M India, P Safari, J Hernando - arXiv preprint arXiv:1906.09890, 2019 - arxiv.org
Most state-of-the-art Deep Learning (DL) approaches for speaker recognition work on a
short utterance level. Given the speech signal, these algorithms extract a sequence of …

Deep speaker embedding learning with multi-level pooling for text-independent speaker verification

Y Tang, G Ding, J Huang, X He… - ICASSP 2019-2019 IEEE …, 2019 - ieeexplore.ieee.org
This paper aims to improve the widely used deep speaker embedding x-vector model. We
propose the following improvements: (1) a hybrid neural network structure using both time …

MagNetO: X-vector Magnitude Estimation Network plus Offset for Improved Speaker Recognition

D Garcia-Romero, G Sell, A McCree - Odyssey, 2020 - isca-archive.org
We present a magnitude estimation network that is combined with a modified ResNet
x-vector system to generate embeddings whose inner product is able to produce calibrated …

The STC system for the CHiME-6 challenge

I Medennikov, M Korenevsky, T Prisyach… - … 2020 Workshop on …, 2020 - isca-archive.org
This paper is a description of the Speech Technology Center (STC) systems for the CHiME-6
challenge aimed at multi-microphone multi-speaker speech recognition and diarization in a …

Deep speaker embeddings for far-field speaker recognition on short utterances

A Gusev, V Volokhov, T Andzhukaev… - arXiv preprint arXiv …, 2020 - arxiv.org
Speaker recognition systems based on deep speaker embeddings have achieved
significant performance in controlled conditions according to the results obtained for early …

x-vector DNN refinement with full-length recordings for speaker recognition

D Garcia-Romero, D Snyder, G Sell, A McCree… - Interspeech, 2019 - danielpovey.com
State-of-the-art text-independent speaker recognition systems for long recordings (a few
minutes) are based on deep neural network (DNN) speaker embeddings. Current …

Voice-indistinguishability: Protecting voiceprint in privacy-preserving speech data release

Y Han, S Li, Y Cao, Q Ma… - 2020 IEEE International …, 2020 - ieeexplore.ieee.org
With the development of smart devices, such as the Amazon Echo and Apple's HomePod,
speech data have become a new dimension of big data. However, privacy and security …

JHU-HLTCOE system for the VoxSRC speaker recognition challenge

D Garcia-Romero, A McCree… - ICASSP 2020-2020 …, 2020 - ieeexplore.ieee.org
The VoxSRC speaker recognition challenge comprises data obtained from YouTube videos
of celebrity interviews in a wide range of recording environments. The challenge provides …