Speaker recognition based on deep learning: An overview

Z Bai, XL Zhang - Neural Networks, 2021 - Elsevier
Speaker recognition is a task of identifying persons from their voices. Recently, deep
learning has dramatically revolutionized speaker recognition. However, there is a lack of …

Adversarial attack and defense strategies for deep speaker recognition systems

A Jati, CC Hsu, M Pal, R Peri, W AbdAlmageed… - Computer Speech & …, 2021 - Elsevier
Robust speaker recognition, including in the presence of malicious attacks, is becoming
increasingly important and essential, especially due to the proliferation of smart speakers …

Individual identification in acoustic recordings

E Knight, T Rhinehart, DR de Zwaan, MJ Weldy… - Trends in Ecology & …, 2024 - cell.com
Recent advances in bioacoustics combined with acoustic individual identification (AIID)
could open frontiers for ecological and evolutionary research because traditional methods of …

Neural MOS prediction for synthesized speech using multi-task learning with spoofing detection and spoofing type classification

Y Choi, Y Jung, H Kim - 2021 IEEE Spoken Language …, 2021 - ieeexplore.ieee.org
Several studies have proposed deep-learning-based models to predict the mean opinion
score (MOS) of synthesized speech, showing the possibility of replacing human raters …

Robust multi-channel far-field speaker verification under different in-domain data availability scenarios

X Qin, D Cai, M Li - IEEE/ACM Transactions on Audio, Speech …, 2022 - ieeexplore.ieee.org
The popularity and application of smart home devices have made far-field speaker
verification an urgent need. However, speaker verification performance is unsatisfactory …

Adversarial defense for deep speaker recognition using hybrid adversarial training

M Pal, A Jati, R Peri, CC Hsu… - ICASSP 2021-2021 …, 2021 - ieeexplore.ieee.org
Deep neural network based speaker recognition systems can easily be deceived by an
adversary using minuscule imperceptible perturbations to the input speech samples. These …

Robust speaker recognition using unsupervised adversarial invariance

R Peri, M Pal, A Jati, K Somandepalli… - ICASSP 2020-2020 …, 2020 - ieeexplore.ieee.org
In this paper, we address the problem of speaker recognition in challenging acoustic
conditions using a novel method to extract robust speaker-discriminative speech …

Meta-learning with latent space clustering in generative adversarial network for speaker diarization

M Pal, M Kumar, R Peri, TJ Park, SH Kim… - … ACM transactions on …, 2021 - ieeexplore.ieee.org
The performance of most speaker diarization systems with x-vector embeddings is both
vulnerable to noisy environments and lacks domain robustness. Earlier work on speaker …

Temporal dynamics of workplace acoustic scenes: Egocentric analysis and prediction

A Jati, A Nadarajan, R Peri, K Mundnich… - … on Audio, Speech …, 2021 - ieeexplore.ieee.org
Identification of the acoustic environment from an audio recording, also known as acoustic
scene classification, is an active area of research. In this paper, we study dynamically …

Deep speaker embedding with frame-constrained training strategy for speaker verification

B Gu - INTERSPEECH, 2022 - isca-archive.org
Speech signals contain a lot of side information (content, stress, etc.), besides the voiceprint
statistics. The session-variability poses a huge challenge for modeling speaker …