Recent advances in end-to-end automatic speech recognition
J Li - APSIPA Transactions on Signal and Information …, 2022 - nowpublishers.com
Recently, the speech community is seeing a significant trend of moving from deep neural
network based hybrid modeling to end-to-end (E2E) modeling for automatic speech …
A metaverse: Taxonomy, components, applications, and open challenges
SM Park, YG Kim - IEEE Access, 2022 - ieeexplore.ieee.org
Unlike previous studies on the Metaverse based on Second Life, the current Metaverse is
based on the social value of Generation Z that online and offline selves are not different …
Contextual adapters for personalized speech recognition in neural transducers
Personal rare word recognition in end-to-end Automatic Speech Recognition (E2E ASR)
models is a challenge due to the lack of training data. A standard way to address this issue …
Contextualized streaming end-to-end speech recognition with trie-based deep biasing and shallow fusion
How to leverage dynamic contextual information in end-to-end speech recognition has
remained an active research area. Previous solutions to this problem were either designed …
Improving end-to-end contextual speech recognition with fine-grained contextual knowledge selection
Nowadays, most methods for end-to-end contextual speech recognition bias the recognition
process towards contextual knowledge. Since all-neural contextual biasing methods rely on …
Personalization of CTC speech recognition models
End-to-end speech recognition models trained using joint Connectionist Temporal
Classification (CTC)-Attention loss have gained popularity recently. In these models, a non …
End-to-end speech recognition contextualization with large language models
In recent years, Large Language Models (LLMs) have garnered significant attention from the
research community due to their exceptional performance and generalization capabilities. In …
Tree-constrained pointer generator for end-to-end contextual speech recognition
Contextual knowledge is important for real-world automatic speech recognition (ASR)
applications. In this paper, a novel tree-constrained pointer generator (TCPGen) component …
Can contextual biasing remain effective with Whisper and GPT-2?
End-to-end automatic speech recognition (ASR) and large language models, such as
Whisper and GPT-2, have recently been scaled to use vast amounts of training data. Despite …
Towards contextual spelling correction for customization of end-to-end speech recognition systems
Contextual biasing is an important and challenging task for end-to-end automatic speech
recognition (ASR) systems, which aims to achieve better recognition performance by biasing …