Speaker recognition based on deep learning: An overview

Z Bai, XL Zhang - Neural Networks, 2021 - Elsevier
Speaker recognition is a task of identifying persons from their voices. Recently, deep
learning has dramatically revolutionized speaker recognition. However, there is a lack of …

A comprehensive survey on model compression and acceleration

T Choudhary, V Mishra, A Goswami… - Artificial Intelligence …, 2020 - Springer
In recent years, machine learning (ML) and deep learning (DL) have shown remarkable
improvement in computer vision, natural language processing, stock prediction, forecasting …

LoRA: Low-rank adaptation of large language models

EJ Hu, Y Shen, P Wallis, Z Allen-Zhu, Y Li… - arXiv preprint arXiv …, 2021 - arxiv.org
An important paradigm of natural language processing consists of large-scale pre-training
on general domain data and adaptation to particular tasks or domains. As we pre-train larger …

ECAPA-TDNN: Emphasized channel attention, propagation and aggregation in TDNN based speaker verification

B Desplanques, J Thienpondt, K Demuynck - arXiv preprint arXiv …, 2020 - arxiv.org
Current speaker verification techniques rely on a neural network to extract speaker
representations. The successful x-vector architecture is a Time Delay Neural Network …

A survey on model compression for large language models

X Zhu, J Li, Y Liu, C Ma, W Wang - arXiv preprint arXiv:2308.07633, 2023 - arxiv.org
Large Language Models (LLMs) have revolutionized natural language processing tasks with
remarkable success. However, their formidable size and computational demands present …

WenetSpeech: A 10000+ hours multi-domain Mandarin corpus for speech recognition

B Zhang, H Lv, P Guo, Q Shao, C Yang… - ICASSP 2022-2022 …, 2022 - ieeexplore.ieee.org
In this paper, we present WenetSpeech, a multi-domain Mandarin corpus consisting of
10000+ hours high-quality labeled speech, 2400+ hours weakly labeled speech, and about …

CHiME-6 challenge: Tackling multispeaker speech recognition for unsegmented recordings

S Watanabe, M Mandel, J Barker, E Vincent… - arXiv preprint arXiv …, 2020 - arxiv.org
Following the success of the 1st, 2nd, 3rd, 4th and 5th CHiME challenges we organize the
6th CHiME Speech Separation and Recognition Challenge (CHiME-6). The new challenge …

Towards edge computing in intelligent manufacturing: Past, present and future

G Nain, KK Pattanaik, GK Sharma - Journal of Manufacturing Systems, 2022 - Elsevier
Industry 4.0 (I4.0) is the fourth industrial revolution and a synonym for intelligent
manufacturing. It drives the convergence of several cutting-edge technologies to provoke …

Jasper: An end-to-end convolutional neural acoustic model

J Li, V Lavrukhin, B Ginsburg, R Leary… - arXiv preprint arXiv …, 2019 - arxiv.org
In this paper, we report state-of-the-art results on LibriSpeech among end-to-end speech
recognition models without any external training data. Our model, Jasper, uses only 1D …

TED-LIUM 3: Twice as much data and corpus repartition for experiments on speaker adaptation

F Hernandez, V Nguyen, S Ghannay… - Speech and Computer …, 2018 - Springer
In this paper, we present TED-LIUM release 3 corpus (TED-LIUM 3 is available on
https://lium.univ-lemans.fr/ted-lium3/) dedicated to speech recognition in English, which …