A review of deep learning techniques for speech processing

A Mehrish, N Majumder, R Bharadwaj, R Mihalcea… - Information …, 2023 - Elsevier
The field of speech processing has undergone a transformative shift with the advent of deep
learning. The use of multiple processing layers has enabled the creation of models capable …

Speaker recognition based on deep learning: An overview

Z Bai, XL Zhang - Neural Networks, 2021 - Elsevier
Speaker recognition is a task of identifying persons from their voices. Recently, deep
learning has dramatically revolutionized speaker recognition. However, there is a lack of …

CHiME-6 challenge: Tackling multispeaker speech recognition for unsegmented recordings

S Watanabe, M Mandel, J Barker, E Vincent… - arXiv preprint arXiv …, 2020 - arxiv.org
Following the success of the 1st, 2nd, 3rd, 4th and 5th CHiME challenges we organize the
6th CHiME Speech Separation and Recognition Challenge (CHiME-6). The new challenge …

A survey of speaker recognition: Fundamental theories, recognition methods and opportunities

MM Kabir, MF Mridha, J Shin, I Jahan, AQ Ohi - IEEE Access, 2021 - ieeexplore.ieee.org
Humans can identify a speaker by listening to their voice, over the telephone, or on any
digital devices. Acquiring this congenital human competency, authentication technologies …

The CHiME-7 DASR challenge: Distant meeting transcription with multiple devices in diverse scenarios

S Cornell, M Wiesner, S Watanabe, D Raj… - arXiv preprint arXiv …, 2023 - arxiv.org
The CHiME challenges have played a significant role in the development and evaluation of
robust automatic speech recognition (ASR) systems. We introduce the CHiME-7 distant ASR (DASR) …

Augmentation adversarial training for self-supervised speaker recognition

J Huh, HS Heo, J Kang, S Watanabe… - arXiv preprint arXiv …, 2020 - arxiv.org
The goal of this work is to train robust speaker recognition models without speaker labels.
Recent works on unsupervised speaker representations are based on contrastive learning …

Augmented datasheets for speech datasets and ethical decision-making

O Papakyriakopoulos, ASG Choi, W Thong… - Proceedings of the …, 2023 - dl.acm.org
Speech datasets are crucial for training Speech Language Technologies (SLT); however,
the lack of diversity of the underlying training data can lead to serious limitations in building …

The VOiCES from a Distance Challenge 2019 evaluation plan

MK Nandwana, J Van Hout, M McLaren… - arXiv preprint arXiv …, 2019 - arxiv.org
The "VOiCES from a Distance Challenge 2019" is designed to foster research in the area of
speaker recognition and automatic speech recognition (ASR) with the special focus on …

The INTERSPEECH 2020 far-field speaker verification challenge

X Qin, M Li, H Bu, W Rao, RK Das… - arXiv preprint arXiv …, 2020 - arxiv.org
The INTERSPEECH 2020 Far-Field Speaker Verification Challenge (FFSVC 2020)
addresses three different research problems under well-defined conditions: far-field text …

Novel speech recognition systems applied to forensics within child exploitation: Wav2vec2.0 vs. Whisper

JC Vásquez-Correa, A Álvarez Muniain - Sensors, 2023 - mdpi.com
The growth in online child exploitation material is a significant challenge for European Law
Enforcement Agencies (LEAs). One of the most important sources of such online information …