Efficient minimum word error rate training of RNN-Transducer for end-to-end speech recognition

J Li - APSIPA Transactions on Signal and Information …, 2022 - nowpublishers.com

Recently, the speech community is seeing a significant trend of moving from deep neural
network based hybrid modeling to end-to-end (E2E) modeling for automatic speech …

Cited by 398 Related articles All 7 versions

[PDF] arxiv.org

Enabling resource-efficient AIoT system with cross-level optimization: A survey

S Liu, B Guo, C Fang, Z Wang, S Luo… - … Surveys & Tutorials, 2023 - ieeexplore.ieee.org

The emerging field of artificial intelligence of things (AIoT, AI + IoT) is driven by the
widespread use of intelligent infrastructures and the impressive success of deep learning …

Cited by 25 Related articles All 6 versions

[PDF] arxiv.org

JOIST: A joint speech and text streaming model for ASR

TN Sainath, R Prabhavalkar, A Bapna… - 2022 IEEE Spoken …, 2023 - ieeexplore.ieee.org

We present JOIST, an algorithm to train a streaming, cascaded, encoder end-to-end (E2E)
model with both speech-text paired inputs, and text-only unpaired inputs. Unlike previous …

Cited by 33 Related articles All 3 versions

[PDF] arxiv.org

Tied & reduced RNN-T decoder

R Botros, TN Sainath, R David, E Guzman, W Li… - arXiv preprint arXiv …, 2021 - arxiv.org

Previous works on the Recurrent Neural Network-Transducer (RNN-T) models have shown
that, under some conditions, it is possible to simplify its prediction network with little or no …

Cited by 59 Related articles All 5 versions

[PDF] mdpi.com

Electrical energy prediction in residential buildings for short-term horizons using hybrid deep learning strategy

ZA Khan, A Ullah, W Ullah, S Rho, M Lee, SW Baik - Applied Sciences, 2020 - mdpi.com

Smart grid technology based on renewable energy and energy storage systems are
attracting considerable attention towards energy crises. Accurate and reliable model for …

Cited by 76 Related articles All 9 versions

[PDF] arxiv.org

Wav2vec-C: A self-supervised model for speech representation learning

S Sadhu, D He, CW Huang, SH Mallidi, M Wu… - arXiv preprint arXiv …, 2021 - arxiv.org

Wav2vec-C introduces a novel representation learning technique combining elements from
wav2vec 2.0 and VQ-VAE. Our model learns to reproduce quantized representations from …

Cited by 63 Related articles All 6 versions

ASRTest: automated testing for deep-neural-network-driven speech recognition systems

P Ji, Y Feng, J Liu, Z Zhao, Z Chen - Proceedings of the 31st ACM …, 2022 - dl.acm.org

With the rapid development of deep neural networks and end-to-end learning techniques,
automatic speech recognition (ASR) systems have been deployed into our daily and assist …

Cited by 19 Related articles

[PDF] arxiv.org

Personalization strategies for end-to-end speech recognition systems

A Gourav, L Liu, A Gandhe, Y Gu, G Lan… - ICASSP 2021-2021 …, 2021 - ieeexplore.ieee.org

The recognition of personalized content, such as contact names, remains a challenging
problem for end-to-end speech recognition systems. In this work, we demonstrate how first …

Cited by 38 Related articles All 7 versions

[PDF] arxiv.org

Less is more: Improved RNN-T decoding using limited label context and path merging

R Prabhavalkar, Y He, D Rybach… - ICASSP 2021-2021 …, 2021 - ieeexplore.ieee.org

End-to-end models that condition the output sequence on all previously predicted labels
have emerged as popular alternatives to conventional systems for automatic speech …

Cited by 36 Related articles All 5 versions

[PDF] arxiv.org

Efficient training of neural transducer for speech recognition

W Zhou, W Michel, R Schlüter, H Ney - arXiv preprint arXiv:2204.10586, 2022 - arxiv.org

As one of the most popular sequence-to-sequence modeling approaches for speech
recognition, the RNN-Transducer has achieved evolving performance with more and more …

Cited by 20 Related articles All 9 versions