Recent advances in end-to-end automatic speech recognition
J Li - APSIPA Transactions on Signal and Information …, 2022 - nowpublishers.com
Recently, the speech community is seeing a significant trend of moving from deep neural
network based hybrid modeling to end-to-end (E2E) modeling for automatic speech …
Enabling resource-efficient AIoT system with cross-level optimization: A survey
The emerging field of artificial intelligence of things (AIoT, AI+ IoT) is driven by the
widespread use of intelligent infrastructures and the impressive success of deep learning …
JOIST: A joint speech and text streaming model for ASR
We present JOIST, an algorithm to train a streaming, cascaded, encoder end-to-end (E2E)
model with both speech-text paired inputs, and text-only unpaired inputs. Unlike previous …
Tied & reduced RNN-T decoder
Previous works on the Recurrent Neural Network-Transducer (RNN-T) models have shown
that, under some conditions, it is possible to simplify its prediction network with little or no …
Electrical energy prediction in residential buildings for short-term horizons using hybrid deep learning strategy
Smart grid technology based on renewable energy and energy storage systems are
attracting considerable attention towards energy crises. Accurate and reliable model for …
Wav2vec-C: A self-supervised model for speech representation learning
Wav2vec-C introduces a novel representation learning technique combining elements from
wav2vec 2.0 and VQ-VAE. Our model learns to reproduce quantized representations from …
ASRTest: automated testing for deep-neural-network-driven speech recognition systems
With the rapid development of deep neural networks and end-to-end learning techniques,
automatic speech recognition (ASR) systems have been deployed into our daily and assist …
Personalization strategies for end-to-end speech recognition systems
The recognition of personalized content, such as contact names, remains a challenging
problem for end-to-end speech recognition systems. In this work, we demonstrate how first …
Less is more: Improved RNN-T decoding using limited label context and path merging
R Prabhavalkar, Y He, D Rybach… - ICASSP 2021-2021 …, 2021 - ieeexplore.ieee.org
End-to-end models that condition the output sequence on all previously predicted labels
have emerged as popular alternatives to conventional systems for automatic speech …
Efficient training of neural transducer for speech recognition
As one of the most popular sequence-to-sequence modeling approaches for speech
recognition, the RNN-Transducer has achieved evolving performance with more and more …