A review of deep learning techniques for speech processing
The field of speech processing has undergone a transformative shift with the advent of deep
learning. The use of multiple processing layers has enabled the creation of models capable …
learning. The use of multiple processing layers has enabled the creation of models capable …
[PDF][PDF] Recent advances in end-to-end automatic speech recognition
J Li - APSIPA Transactions on Signal and Information …, 2022 - nowpublishers.com
Recently, the speech community is seeing a significant trend of moving from deep neural
network based hybrid modeling to end-to-end (E2E) modeling for automatic speech …
network based hybrid modeling to end-to-end (E2E) modeling for automatic speech …
A review on the attention mechanism of deep learning
Attention has arguably become one of the most important concepts in the deep learning
field. It is inspired by the biological systems of humans that tend to focus on the distinctive …
field. It is inspired by the biological systems of humans that tend to focus on the distinctive …
End-to-end speech recognition: A survey
In the last decade of automatic speech recognition (ASR) research, the introduction of deep
learning has brought considerable reductions in word error rate of more than 50% relative …
learning has brought considerable reductions in word error rate of more than 50% relative …
Stgat: Modeling spatial-temporal interactions for human trajectory prediction
Human trajectory prediction is challenging and critical in various applications (eg,
autonomous vehicles and social robots). Because of the continuity and foresight of the …
autonomous vehicles and social robots). Because of the continuity and foresight of the …
Wenet: Production oriented streaming and non-streaming end-to-end speech recognition toolkit
In this paper, we propose an open source, production first, and production ready speech
recognition toolkit called WeNet in which a new two-pass approach is implemented to unify …
recognition toolkit called WeNet in which a new two-pass approach is implemented to unify …
Deep learning for audio signal processing
Given the recent surge in developments of deep learning, this paper provides a review of the
state-of-the-art deep learning techniques for audio signal processing. Speech, music, and …
state-of-the-art deep learning techniques for audio signal processing. Speech, music, and …
State-of-the-art speech recognition with sequence-to-sequence models
Attention-based encoder-decoder architectures such as Listen, Attend, and Spell (LAS),
subsume the acoustic, pronunciation and language model components of a traditional …
subsume the acoustic, pronunciation and language model components of a traditional …
Developing real-time streaming transformer transducer for speech recognition on large-scale dataset
Recently, Transformer based end-to-end models have achieved great success in many
areas including speech recognition. However, compared to LSTM models, the heavy …
areas including speech recognition. However, compared to LSTM models, the heavy …
Artificial intelligence in clinical and genomic diagnostics
R Dias, A Torkamani - Genome medicine, 2019 - Springer
Artificial intelligence (AI) is the development of computer systems that are able to perform
tasks that normally require human intelligence. Advances in AI software and hardware …
tasks that normally require human intelligence. Advances in AI software and hardware …