Speaker recognition based on deep learning: An overview
Speaker recognition is a task of identifying persons from their voices. Recently, deep
learning has dramatically revolutionized speaker recognition. However, there is a lack of …
Zero-shot multi-speaker text-to-speech with state-of-the-art neural speaker embeddings
While speaker adaptation for end-to-end speech synthesis using speaker embeddings can
produce good speaker similarity for speakers seen during training, there remains a gap for …
Improved RawNet with feature map scaling for text-independent speaker verification using raw waveforms
Recent advances in deep learning have facilitated the design of speaker verification
systems that directly input raw waveforms. For example, RawNet extracts speaker …
Meta-learning for short utterance speaker recognition with imbalance length pairs
In practical settings, a speaker recognition system needs to identify a speaker given a short
utterance, while the enrollment utterance may be relatively long. However, existing speaker …
Improving multi-scale aggregation using feature pyramid module for robust speaker verification of variable-duration utterances
Currently, the most widely used approach for speaker verification is the deep speaker
embedding learning. In this approach, we obtain a speaker embedding vector by pooling …
Graph attentive feature aggregation for text-independent speaker verification
The objective of this paper is to combine multiple frame-level features into a single utterance-
level representation considering pair-wise relationships. For this purpose, we propose a …
Double multi-head attention for speaker verification
Most state-of-the-art Deep Learning systems for text-independent speaker verification are
based on speaker embedding extractors. These architectures are commonly composed of a …
Towards improving synthetic audio spoofing detection robustness via meta-learning and disentangled training with adversarial examples
Z Wang, JHL Hansen - IEEE Access, 2024 - ieeexplore.ieee.org
Advances in automatic speaker verification (ASV) promote research into the formulation of
spoofing detection systems for real-world applications. The performance of ASV systems can …
Deep MOS predictor for synthetic speech using cluster-based modeling
While deep learning has made impressive progress in speech synthesis and voice
conversion, the assessment of the synthesized speech is still carried out by human …
A unified deep learning framework for short-duration speaker verification in adverse environments
Speaker verification (SV) has recently attracted considerable research interest due to the
growing popularity of virtual assistants. At the same time, there is an increasing requirement …