Automatic speaker verification spoofing and deepfake detection using wav2vec 2.0 and data augmentation
The performance of spoofing countermeasure systems depends fundamentally upon the use
of sufficiently representative training data. With this usually being limited, current solutions …
Speech and speaker recognition using raw waveform modeling for adult and children's speech: A comprehensive review
Conventionally, the extraction of hand-crafted acoustic features has been separated from the
task of establishing robust machine-learning models in speech processing. The manual …
Pushing the limits of raw waveform speaker recognition
In recent years, speaker recognition systems based on raw waveform inputs have received
increasing attention. However, the performance of such systems is typically inferior to the …
Frequency and multi-scale selective kernel attention for speaker verification
The majority of recent state-of-the-art speaker verification architectures adopt multi-scale
processing and frequency-channel attention mechanisms. Convolutional layers of these …
Short-segment speaker verification using ECAPA-TDNN with multi-resolution encoder
Time-domain approaches have shown the potential to improve the performance of speaker
verification, but still predominant approaches utilize hand-crafted features such as the mel …
RSKNet-MTSP: Effective and portable deep architecture for speaker verification
Y Wu, C Guo, J Zhao, X Jin, J Xu - Neurocomputing, 2022 - Elsevier
The convolutional neural network (CNN) based approaches have shown great success for
speaker verification (SV) tasks, where modeling long temporal context and reducing …
Fisher ratio-based multi-domain frame-level feature aggregation for short utterance speaker verification
As the durations of the short utterances are small, it is difficult to learn sufficient information
to distinguish the person, thus, short utterance speaker recognition is highly challenging. In …
Multi-level attention network: Mixed time–frequency channel attention and multi-scale self-attentive standard deviation pooling for speaker recognition
L Deng, F Deng, K Zhou, P Jiang, G Zhang… - … Applications of Artificial …, 2024 - Elsevier
In this paper, we propose a more efficient lightweight speaker recognition network, the multi-
level attention network (MANet). MANet aims to generate more robust and discriminative …
VoiceExtender: Short-Utterance Text-Independent Speaker Verification With Guided Diffusion Model
Y He, Z Kang, J Wang, J Peng… - 2023 IEEE Automatic …, 2023 - ieeexplore.ieee.org
Speaker verification (SV) performance deteriorates as utterances become shorter. To this
end, we propose a new architecture called VoiceExtender which provides a promising …
End-to-end deep speaker embedding learning using multi-scale attentional fusion and graph neural networks
HB Kashani, S Jazmi - Expert Systems with Applications, 2023 - Elsevier
As an attractive research in biometric authentication, Text Independent Speaker Verification
(TI-SV) problem aims to specify whether two given unconstrained utterances come from the …