Recent advances in end-to-end automatic speech recognition

J Li - APSIPA Transactions on Signal and Information …, 2022 - nowpublishers.com
Recently, the speech community has been seeing a significant trend of moving from deep neural
network based hybrid modeling to end-to-end (E2E) modeling for automatic speech …

Enabling resource-efficient aiot system with cross-level optimization: A survey

S Liu, B Guo, C Fang, Z Wang, S Luo… - … Surveys & Tutorials, 2023 - ieeexplore.ieee.org
The emerging field of artificial intelligence of things (AIoT, AI + IoT) is driven by the
widespread use of intelligent infrastructures and the impressive success of deep learning …

Structured pruning of large language models

Z Wang, J Wohlwend, T Lei - arXiv preprint arXiv:1910.04732, 2019 - arxiv.org
Large language models have recently achieved state-of-the-art performance across a wide
variety of natural language tasks. Meanwhile, the size of these models and their latency …

VoiceFilter-Lite: Streaming targeted voice separation for on-device speech recognition

Q Wang, IL Moreno, M Saglam, K Wilson… - arXiv preprint arXiv …, 2020 - arxiv.org
We introduce VoiceFilter-Lite, a single-channel source separation model that runs on the
device to preserve only the speech signals from a target user, as part of a streaming speech …

Parp: Prune, adjust and re-prune for self-supervised speech recognition

CIJ Lai, Y Zhang, AH Liu, S Chang… - Advances in …, 2021 - proceedings.neurips.cc
Self-supervised speech representation learning (speech SSL) has demonstrated the benefit
of scale in learning rich representations for Automatic Speech Recognition (ASR) with …

Coarsening the granularity: Towards structurally sparse lottery tickets

T Chen, X Chen, X Ma, Y Wang… - … conference on machine …, 2022 - proceedings.mlr.press
The lottery ticket hypothesis (LTH) has shown that dense models contain highly sparse
subnetworks (i.e., winning tickets) that can be trained in isolation to match full accuracy …

When attention meets fast recurrence: Training language models with reduced compute

T Lei - arXiv preprint arXiv:2102.12459, 2021 - arxiv.org
Large language models have become increasingly difficult to train because of the growing
computation time and cost. In this work, we present SRU++, a highly-efficient architecture …

Alignment restricted streaming recurrent neural network transducer

J Mahadeokar, Y Shangguan, D Le… - 2021 IEEE Spoken …, 2021 - ieeexplore.ieee.org
There is a growing interest in the speech community in developing Recurrent Neural
Network Transducer (RNN-T) models for automatic speech recognition (ASR) applications …

Automatic speech recognition using limited vocabulary: A survey

JLKE Fendji, DCM Tala, BO Yenke… - Applied Artificial …, 2022 - Taylor & Francis
Automatic Speech Recognition (ASR) is an active field of research due to its
large number of applications and the proliferation of interfaces or computing devices that …

CHIMERA: A 0.92-TOPS, 2.2-TOPS/W edge AI accelerator with 2-MByte on-chip foundry resistive RAM for efficient training and inference

K Prabhu, A Gural, ZF Khan… - IEEE Journal of Solid …, 2022 - ieeexplore.ieee.org
Implementing edge artificial intelligence (AI) inference and training is challenging with
current memory technologies. As deep neural networks (DNNs) grow in size, this problem is …