Local information modeling with self-attention for speaker verification

B Han, Z Chen, Y Qian - ICASSP 2022-2022 IEEE International …, 2022 - ieeexplore.ieee.org
Transformer based on self attention mechanism has demonstrated its state-of-the-art
performance in most natural language processing (NLP) tasks, but it's not very competitive …

Improving fairness in speaker verification via group-adapted fusion network

H Shen, Y Yang, G Sun, R Langman… - ICASSP 2022-2022 …, 2022 - ieeexplore.ieee.org
Modern speaker verification models use deep neural networks to encode utterance audio
into discriminative embedding vectors. During the training process, these networks are …

Optimizing tandem speaker verification and anti-spoofing systems

A Kanervisto, V Hautamäki, T Kinnunen… - … on Audio, Speech …, 2021 - ieeexplore.ieee.org
As automatic speaker verification (ASV) systems are vulnerable to spoofing attacks, they are
typically used in conjunction with spoofing countermeasure (CM) systems to improve …

Graph attention networks for speaker verification

J Jung, HS Heo, HJ Yu… - ICASSP 2021-2021 IEEE …, 2021 - ieeexplore.ieee.org
This work presents a novel back-end framework for speaker verification using graph
attention networks. Segment-wise speaker embeddings extracted from multiple crops within …

Text-independent voiceprint recognition via compact embedding of dilated deep convolutional neural networks

V Karthikeyan, SS Priyadharsini - Computers and Electrical Engineering, 2024 - Elsevier
In order to process speech, most state-of-the-art experimental methods employ
convolutional neural networks (CNNs), which operate on a continuous, 1-dimensional (1-D) …

Self-supervised representation learning with path integral clustering for speaker diarization

P Singh, S Ganapathy - IEEE/ACM Transactions on Audio …, 2021 - ieeexplore.ieee.org
Automatic speaker diarization techniques typically involve a two-stage processing approach
where audio segments of fixed duration are converted to vector representations in the first …

ChildAugment: Data augmentation methods for zero-resource children's speaker verification

VP Singh, M Sahidullah, T Kinnunen - The Journal of the Acoustical …, 2024 - pubs.aip.org
The accuracy of modern automatic speaker verification (ASV) systems, when trained
exclusively on adult data, drops substantially when applied to children's speech. The …

Self-supervised metric learning with graph clustering for speaker diarization

P Singh, S Ganapathy - 2021 IEEE Automatic Speech …, 2021 - ieeexplore.ieee.org
In this paper, we propose a novel algorithm for speaker diarization using metric learning for
graph based clustering. The graph clustering algorithms use an adjacency matrix consisting …

Anonymizing speech: Evaluating and designing speaker anonymization techniques

P Champion - arXiv preprint arXiv:2308.04455, 2023 - arxiv.org
The growing use of voice user interfaces has led to a surge in the collection and storage of
speech data. While data collection allows for the development of efficient tools powering …

CTFALite: Lightweight Channel-specific Temporal and Frequency Attention Mechanism for Enhancing the Speaker Embedding Extractor

Y Wei, J Du, H Liu, Q Wang - INTERSPEECH, 2022 - isca-archive.org
Attention mechanism provides an effective and plug-and-play feature enhancement module
for speaker embedding extractors. Attention-based pooling layers have been widely used to …