Speaker recognition based on deep learning: An overview
Speaker recognition is a task of identifying persons from their voices. Recently, deep
learning has dramatically revolutionized speaker recognition. However, there is lack of …
Deep learning methods in speaker recognition: a review
This paper summarizes the applied deep learning practices in the field of speaker
recognition, both verification and identification. Speaker recognition has been a widely used …
MFA: TDNN with multi-scale frequency-channel attention for text-independent speaker verification with short utterances
The time delay neural network (TDNN) represents one of the state-of-the-art of neural
solutions to text-independent speaker verification. However, they require a large number of …
Densely Connected Time Delay Neural Network for Speaker Verification.
Time delay neural network (TDNN) has been widely used in speaker verification tasks.
Recently, two TDNN-based models, including extended TDNN (E-TDNN) and factorized …
Multi-view self-attention based transformer for speaker recognition
Initially developed for natural language processing (NLP), Transformer model is now widely
used for speech processing tasks such as speaker recognition, due to its powerful sequence …
Improving multi-scale aggregation using feature pyramid module for robust speaker verification of variable-duration utterances
Currently, the most widely used approach for speaker verification is the deep speaker
embedding learning. In this approach, we obtain a speaker embedding vector by pooling …
Vector-based attentive pooling for text-independent speaker verification.
Y Wu, C Guo, H Gao, X Hou, J Xu - Interspeech, 2020 - interspeech2020.org
The pooling mechanism plays an important role in deep neural network based systems for
text-independent speaker verification, which aggregates the variable-length frame-level …
D-MONA: A dilated mixed-order non-local attention network for speaker and language recognition
Attention-based convolutional neural network (CNN) models are increasingly being adopted
for speaker and language recognition (SR/LR) tasks. These include time, frequency, spatial …
An effective deep embedding learning method based on dense-residual networks for speaker verification
Y Liu, Y Song, I McLoughlin, L Liu… - ICASSP 2021-2021 IEEE …, 2021 - ieeexplore.ieee.org
In this paper, we present an effective end-to-end deep embedding learning method based
on Dense-Residual networks, which combine the advantages of a densely connected …
Global–local self-attention based transformer for speaker verification
F Xie, D Zhang, C Liu - Applied Sciences, 2022 - mdpi.com
Transformer models are now widely used for speech processing tasks due to their powerful
sequence modeling capabilities. Previous work determined an efficient way to model …