Recent advances in end-to-end automatic speech recognition
J Li - APSIPA Transactions on Signal and Information …, 2022 - nowpublishers.com
Recently, the speech community is seeing a significant trend of moving from deep neural
network based hybrid modeling to end-to-end (E2E) modeling for automatic speech …
Enabling resource-efficient aiot system with cross-level optimization: A survey
The emerging field of artificial intelligence of things (AIoT, AI+ IoT) is driven by the
widespread use of intelligent infrastructures and the impressive success of deep learning …
Structured pruning of large language models
Large language models have recently achieved state of the art performance across a wide
variety of natural language tasks. Meanwhile, the size of these models and their latency …
VoiceFilter-Lite: Streaming targeted voice separation for on-device speech recognition
We introduce VoiceFilter-Lite, a single-channel source separation model that runs on the
device to preserve only the speech signals from a target user, as part of a streaming speech …
Parp: Prune, adjust and re-prune for self-supervised speech recognition
Self-supervised speech representation learning (speech SSL) has demonstrated the benefit
of scale in learning rich representations for Automatic Speech Recognition (ASR) with …
Coarsening the granularity: Towards structurally sparse lottery tickets
The lottery ticket hypothesis (LTH) has shown that dense models contain highly sparse
subnetworks (i.e., winning tickets) that can be trained in isolation to match full accuracy …
When attention meets fast recurrence: Training language models with reduced compute
T Lei - arXiv preprint arXiv:2102.12459, 2021 - arxiv.org
Large language models have become increasingly difficult to train because of the growing
computation time and cost. In this work, we present SRU++, a highly-efficient architecture …
Alignment restricted streaming recurrent neural network transducer
There is a growing interest in the speech community in developing Recurrent Neural
Network Transducer (RNN-T) models for automatic speech recognition (ASR) applications …
Automatic speech recognition using limited vocabulary: A survey
JLKE Fendji, DCM Tala, BO Yenke… - Applied Artificial …, 2022 - Taylor & Francis
ABSTRACT Automatic Speech Recognition (ASR) is an active field of research due to its
large number of applications and the proliferation of interfaces or computing devices that …
CHIMERA: A 0.92-TOPS, 2.2-TOPS/W edge AI accelerator with 2-MByte on-chip foundry resistive RAM for efficient training and inference
Implementing edge artificial intelligence (AI) inference and training is challenging with
current memory technologies. As deep neural networks (DNNs) grow in size, this problem is …