Multi-Channel Training for End-to-End Speaker Recognition Under Reverberant and Noisy Environment.

Z Bai, XL Zhang - Neural Networks, 2021 - Elsevier

Speaker recognition is a task of identifying persons from their voices. Recently, deep
learning has dramatically revolutionized speaker recognition. However, there is a lack of …

Cited by 419 · Related articles · All 9 versions

[PDF] ieee.org

Mel frequency cepstral coefficient and its applications: A review

ZK Abdul, AK Al-Talabani - IEEE Access, 2022 - ieeexplore.ieee.org

Feature extraction and representation has significant impact on the performance of any
machine learning method. Mel Frequency Cepstrum Coefficient (MFCC) is designed to …

Cited by 225 · Related articles · All 3 versions

[PDF] arxiv.org

The INTERSPEECH 2020 far-field speaker verification challenge

X Qin, M Li, H Bu, W Rao, RK Das… - arXiv preprint arXiv …, 2020 - arxiv.org

The INTERSPEECH 2020 Far-Field Speaker Verification Challenge (FFSVC 2020)
addresses three different research problems under well-defined conditions: far-field text …

Cited by 57 · Related articles · All 14 versions

[PDF] isca-archive.org

Far-Field End-to-End Text-Dependent Speaker Verification Based on Mixed Training Data with Transfer Learning and Enrollment Data Augmentation.

X Qin, D Cai, M Li - Interspeech, 2019 - isca-archive.org

In this paper, we focus on the far-field end-to-end text-dependent speaker verification task
with a small-scale far-field text dependent dataset and a large scale close-talking text …

Cited by 48 · Related articles · All 7 versions

Robust multi-channel far-field speaker verification under different in-domain data availability scenarios

X Qin, D Cai, M Li - IEEE/ACM Transactions on Audio, Speech …, 2022 - ieeexplore.ieee.org

The popularity and application of smart home devices have made far-field speaker
verification an urgent need. However, speaker verification performance is unsatisfactory …

Cited by 13 · Related articles · All 2 versions

[PDF] duke.edu

The DKU audio-visual wake word spotting system for the 2021 MISP challenge

M Cheng, H Wang, Y Wang, M Li - ICASSP 2022-2022 IEEE …, 2022 - ieeexplore.ieee.org

This paper describes the system developed by the DKU team for the MISP Challenge 2021.
We present a two-stage approach consisting of end-to-end neural networks for the audio …

Cited by 14 · Related articles · All 4 versions

[PDF] arxiv.org

VE-KWS: Visual modality enhanced end-to-end keyword spotting

A Zhang, H Wang, P Guo, Y Fu, L Xie… - ICASSP 2023-2023 …, 2023 - ieeexplore.ieee.org

The performance of the keyword spotting (KWS) system based on audio modality, commonly
measured in false alarms and false rejects, degrades significantly under the far field and …

Cited by 7 · Related articles · All 3 versions

[PDF] arxiv.org

Deep feature CycleGANs: Speaker identity preserving non-parallel microphone-telephone domain adaptation for speaker verification

S Kataria, J Villalba, P Żelasko… - arXiv preprint arXiv …, 2021 - arxiv.org

With the increase in the availability of speech from varied domains, it is imperative to use
such out-of-domain data to improve existing speech systems. Domain adaptation is a …

Cited by 15 · Related articles · All 8 versions

[PDF] arxiv.org

Royalflush speaker diarization system for ICASSP 2022 multi-channel multi-party meeting transcription challenge

J Tian, X Hu, X Xu - arXiv preprint arXiv:2202.04814, 2022 - arxiv.org

This paper describes the Royalflush speaker diarization system submitted to the Multi-
channel Multi-party Meeting Transcription Challenge (M2MeT). Our system comprises …

Cited by 9 · Related articles · All 3 versions

[PDF] arxiv.org

MultiSV: Dataset for far-field multi-channel speaker verification

L Mošner, O Plchot, L Burget… - ICASSP 2022-2022 …, 2022 - ieeexplore.ieee.org

Motivated by unconsolidated data situation and the lack of a standard benchmark in the
field, we complement our previous efforts and present a comprehensive corpus designed for …

Cited by 10 · Related articles · All 3 versions