Voices Obscured in Complex Environmental Settings (VOiCES) corpus

A Mehrish, N Majumder, R Bharadwaj, R Mihalcea… - Information …, 2023 - Elsevier

The field of speech processing has undergone a transformative shift with the advent of deep
learning. The use of multiple processing layers has enabled the creation of models capable …

Cited by 80; all 6 versions

Speaker recognition based on deep learning: An overview

Z Bai, XL Zhang - Neural Networks, 2021 - Elsevier

Speaker recognition is a task of identifying persons from their voices. Recently, deep
learning has dramatically revolutionized speaker recognition. However, there is lack of …

Cited by 327; all 9 versions

CHiME-6 challenge: Tackling multispeaker speech recognition for unsegmented recordings

S Watanabe, M Mandel, J Barker, E Vincent… - arXiv preprint arXiv …, 2020 - arxiv.org

Following the success of the 1st, 2nd, 3rd, 4th and 5th CHiME challenges we organize the
6th CHiME Speech Separation and Recognition Challenge (CHiME-6). The new challenge …

Cited by 294; all 7 versions

A survey of speaker recognition: Fundamental theories, recognition methods and opportunities

MM Kabir, MF Mridha, J Shin, I Jahan, AQ Ohi - IEEE Access, 2021 - ieeexplore.ieee.org

Humans can identify a speaker by listening to their voice, over the telephone, or on any
digital devices. Acquiring this congenital human competency, authentication technologies …

Cited by 86; all 5 versions

The CHiME-7 DASR challenge: Distant meeting transcription with multiple devices in diverse scenarios

S Cornell, M Wiesner, S Watanabe, D Raj… - arXiv preprint arXiv …, 2023 - arxiv.org

The CHiME challenges have played a significant role in the development and evaluation of
robust speech recognition (ASR) systems. We introduce the CHiME-7 distant ASR (DASR) …

Cited by 28; all 7 versions

Augmentation adversarial training for self-supervised speaker recognition

J Huh, HS Heo, J Kang, S Watanabe… - arXiv preprint arXiv …, 2020 - arxiv.org

The goal of this work is to train robust speaker recognition models without speaker labels.
Recent works on unsupervised speaker representations are based on contrastive learning …

Cited by 77; all 3 versions

Augmented datasheets for speech datasets and ethical decision-making

O Papakyriakopoulos, ASG Choi, W Thong… - Proceedings of the …, 2023 - dl.acm.org

Speech datasets are crucial for training Speech Language Technologies (SLT); however,
the lack of diversity of the underlying training data can lead to serious limitations in building …

Cited by 15; all 5 versions

The VOiCES from a Distance Challenge 2019 evaluation plan

MK Nandwana, J Van Hout, M McLaren… - arXiv preprint arXiv …, 2019 - arxiv.org

The "VOiCES from a Distance Challenge 2019" is designed to foster research in the area of
speaker recognition and automatic speech recognition (ASR) with the special focus on …

Cited by 99; all 6 versions

The INTERSPEECH 2020 far-field speaker verification challenge

X Qin, M Li, H Bu, W Rao, RK Das… - arXiv preprint arXiv …, 2020 - arxiv.org

The INTERSPEECH 2020 Far-Field Speaker Verification Challenge (FFSVC 2020)
addresses three different research problems under well-defined conditions: far-field text …

Cited by 55; all 13 versions

Novel speech recognition systems applied to forensics within child exploitation: Wav2vec2.0 vs. Whisper

JC Vásquez-Correa, A Álvarez Muniain - Sensors, 2023 - mdpi.com

The growth in online child exploitation material is a significant challenge for European Law
Enforcement Agencies (LEAs). One of the most important sources of such online information …

Cited by 20; all 11 versions