Local information modeling with self-attention for speaker verification
Transformer based on self-attention mechanism has demonstrated its state-of-the-art
performance in most natural language processing (NLP) tasks, but it's not very competitive …
Improving fairness in speaker verification via group-adapted fusion network
Modern speaker verification models use deep neural networks to encode utterance audio
into discriminative embedding vectors. During the training process, these networks are …
Optimizing tandem speaker verification and anti-spoofing systems
As automatic speaker verification (ASV) systems are vulnerable to spoofing attacks, they are
typically used in conjunction with spoofing countermeasure (CM) systems to improve …
Graph attention networks for speaker verification
This work presents a novel back-end framework for speaker verification using graph
attention networks. Segment-wise speaker embeddings extracted from multiple crops within …
Text-independent voiceprint recognition via compact embedding of dilated deep convolutional neural networks
V Karthikeyan, SS Priyadharsini - Computers and Electrical Engineering, 2024 - Elsevier
In order to process speech, most state-of-the-art experimental methods employ
convolutional neural networks (CNNs), which operate on a continuous, 1-dimensional (1-D) …
Self-supervised representation learning with path integral clustering for speaker diarization
P Singh, S Ganapathy - IEEE/ACM Transactions on Audio …, 2021 - ieeexplore.ieee.org
Automatic speaker diarization techniques typically involve a two-stage processing approach
where audio segments of fixed duration are converted to vector representations in the first …
ChildAugment: Data augmentation methods for zero-resource children's speaker verification
The accuracy of modern automatic speaker verification (ASV) systems, when trained
exclusively on adult data, drops substantially when applied to children's speech. The …
Self-supervised metric learning with graph clustering for speaker diarization
P Singh, S Ganapathy - 2021 IEEE Automatic Speech …, 2021 - ieeexplore.ieee.org
In this paper, we propose a novel algorithm for speaker diarization using metric learning for
graph based clustering. The graph clustering algorithms use an adjacency matrix consisting …
Anonymizing speech: Evaluating and designing speaker anonymization techniques
P Champion - arXiv preprint arXiv:2308.04455, 2023 - arxiv.org
The growing use of voice user interfaces has led to a surge in the collection and storage of
speech data. While data collection allows for the development of efficient tools powering …
CTFALite: Lightweight Channel-specific Temporal and Frequency Attention Mechanism for Enhancing the Speaker Embedding Extractor
Y Wei, J Du, H Liu, Q Wang - INTERSPEECH, 2022 - isca-archive.org
Attention mechanism provides an effective and plug-and-play feature enhancement module
for speaker embedding extractors. Attention-based pooling layers have been widely used to …