Semi-orthogonal low-rank matrix factorization for deep neural networks.

Z Bai, XL Zhang - Neural Networks, 2021 - Elsevier

Speaker recognition is a task of identifying persons from their voices. Recently, deep
learning has dramatically revolutionized speaker recognition. However, there is lack of …

Cited by 332 · Related articles · All 9 versions

A comprehensive survey on model compression and acceleration

T Choudhary, V Mishra, A Goswami… - Artificial Intelligence …, 2020 - Springer

In recent years, machine learning (ML) and deep learning (DL) have shown remarkable
improvement in computer vision, natural language processing, stock prediction, forecasting …

Cited by 391 · Related articles · All 8 versions

Lora: Low-rank adaptation of large language models

EJ Hu, Y Shen, P Wallis, Z Allen-Zhu, Y Li… - arXiv preprint arXiv …, 2021 - arxiv.org

An important paradigm of natural language processing consists of large-scale pre-training
on general domain data and adaptation to particular tasks or domains. As we pre-train larger …

Cited by 4774 · Related articles · All 10 versions

Ecapa-tdnn: Emphasized channel attention, propagation and aggregation in tdnn based speaker verification

B Desplanques, J Thienpondt, K Demuynck - arXiv preprint arXiv …, 2020 - arxiv.org

Current speaker verification techniques rely on a neural network to extract speaker
representations. The successful x-vector architecture is a Time Delay Neural Network …

Cited by 1210 · Related articles · All 15 versions

A survey on model compression for large language models

X Zhu, J Li, Y Liu, C Ma, W Wang - arXiv preprint arXiv:2308.07633, 2023 - arxiv.org

Large Language Models (LLMs) have revolutionized natural language processing tasks with
remarkable success. However, their formidable size and computational demands present …

Cited by 95 · Related articles · All 2 versions

Wenetspeech: A 10000+ hours multi-domain mandarin corpus for speech recognition

B Zhang, H Lv, P Guo, Q Shao, C Yang… - ICASSP 2022-2022 …, 2022 - ieeexplore.ieee.org

In this paper, we present WenetSpeech, a multi-domain Mandarin corpus consisting of
10000+ hours high-quality labeled speech, 2400+ hours weakly labeled speech, and about …

Cited by 141 · Related articles · All 4 versions

CHiME-6 challenge: Tackling multispeaker speech recognition for unsegmented recordings

S Watanabe, M Mandel, J Barker, E Vincent… - arXiv preprint arXiv …, 2020 - arxiv.org

Following the success of the 1st, 2nd, 3rd, 4th and 5th CHiME challenges we organize the
6th CHiME Speech Separation and Recognition Challenge (CHiME-6). The new challenge …

Cited by 299 · Related articles · All 7 versions

Towards edge computing in intelligent manufacturing: Past, present and future

G Nain, KK Pattanaik, GK Sharma - Journal of Manufacturing Systems, 2022 - Elsevier

Industry 4.0 (I4.0) is the fourth industrial revolution and a synonym for intelligent
manufacturing. It drives the convergence of several cutting-edge technologies to provoke …

Cited by 88 · Related articles

Jasper: An end-to-end convolutional neural acoustic model

J Li, V Lavrukhin, B Ginsburg, R Leary… - arXiv preprint arXiv …, 2019 - arxiv.org

In this paper, we report state-of-the-art results on LibriSpeech among end-to-end speech
recognition models without any external training data. Our model, Jasper, uses only 1D …

Cited by 271 · Related articles · All 8 versions

TED-LIUM 3: Twice as much data and corpus repartition for experiments on speaker adaptation

F Hernandez, V Nguyen, S Ghannay… - Speech and Computer …, 2018 - Springer

In this paper, we present TED-LIUM release 3 corpus (TED-LIUM 3 is available on
https://lium.univ-lemans.fr/ted-lium3/) dedicated to speech recognition in English, which …

Cited by 315 · Related articles · All 8 versions