Speaker recognition based on deep learning: An overview
Speaker recognition is a task of identifying persons from their voices. Recently, deep
learning has dramatically revolutionized speaker recognition. However, there is lack of …
Mel frequency cepstral coefficient and its applications: A review
ZK Abdul, AK Al-Talabani - IEEE Access, 2022 - ieeexplore.ieee.org
Feature extraction and representation has significant impact on the performance of any
machine learning method. Mel Frequency Cepstrum Coefficient (MFCC) is designed to …
The INTERSPEECH 2020 far-field speaker verification challenge
The INTERSPEECH 2020 Far-Field Speaker Verification Challenge (FFSVC 2020)
addresses three different research problems under well-defined conditions: far-field text …
Far-Field End-to-End Text-Dependent Speaker Verification Based on Mixed Training Data with Transfer Learning and Enrollment Data Augmentation
In this paper, we focus on the far-field end-to-end text-dependent speaker verification task
with a small-scale far-field text dependent dataset and a large scale close-talking text …
Robust multi-channel far-field speaker verification under different in-domain data availability scenarios
The popularity and application of smart home devices have made far-field speaker
verification an urgent need. However, speaker verification performance is unsatisfactory …
The DKU audio-visual wake word spotting system for the 2021 MISP challenge
This paper describes the system developed by the DKU team for the MISP Challenge 2021.
We present a two-stage approach consisting of end-to-end neural networks for the audio …
VE-KWS: Visual modality enhanced end-to-end keyword spotting
The performance of the keyword spotting (KWS) system based on audio modality, commonly
measured in false alarms and false rejects, degrades significantly under the far field and …
Deep feature CycleGANs: Speaker identity preserving non-parallel microphone-telephone domain adaptation for speaker verification
With the increase in the availability of speech from varied domains, it is imperative to use
such out-of-domain data to improve existing speech systems. Domain adaptation is a …
Royalflush speaker diarization system for ICASSP 2022 multi-channel multi-party meeting transcription challenge
This paper describes the Royalflush speaker diarization system submitted to the Multi-
channel Multi-party Meeting Transcription Challenge (M2MeT). Our system comprises …
MultiSV: Dataset for far-field multi-channel speaker verification
Motivated by unconsolidated data situation and the lack of a standard benchmark in the
field, we complement our previous efforts and present a comprehensive corpus designed for …