Recent advances in end-to-end automatic speech recognition

J Li - APSIPA Transactions on Signal and Information …, 2022 - nowpublishers.com
Recently, the speech community is seeing a significant trend of moving from deep neural
network based hybrid modeling to end-to-end (E2E) modeling for automatic speech …

End-to-end speech recognition: A survey

R Prabhavalkar, T Hori, TN Sainath… - … on Audio, Speech …, 2023 - ieeexplore.ieee.org
In the last decade of automatic speech recognition (ASR) research, the introduction of deep
learning has brought considerable reductions in word error rate of more than 50% relative …

Developing real-time streaming transformer transducer for speech recognition on large-scale dataset

X Chen, Y Wu, Z Wang, S Liu… - ICASSP 2021-2021 IEEE …, 2021 - ieeexplore.ieee.org
Recently, Transformer based end-to-end models have achieved great success in many
areas including speech recognition. However, compared to LSTM models, the heavy …

Streaming automatic speech recognition with the transformer model

N Moritz, T Hori, J Le Roux - ICASSP 2020-2020 IEEE International …, 2020 - ieeexplore.ieee.org
Encoder-decoder based sequence-to-sequence models have demonstrated state-of-the-art
results in end-to-end automatic speech recognition (ASR). Recently, the transformer …

Improving RNN transducer modeling for end-to-end speech recognition

J Li, R Zhao, H Hu, Y Gong - 2019 IEEE Automatic Speech …, 2019 - ieeexplore.ieee.org
In the last few years, an emerging trend in automatic speech recognition research is the
study of end-to-end (E2E) systems. Connectionist Temporal Classification (CTC), Attention …

Evaluation of neural architectures trained with square loss vs cross-entropy in classification tasks

L Hui, M Belkin - arXiv preprint arXiv:2006.07322, 2020 - arxiv.org
Modern neural architectures for classification tasks are trained using the cross-entropy loss,
which is widely believed to be empirically superior to the square loss. In this work we …

A better and faster end-to-end model for streaming ASR

B Li, A Gulati, J Yu, TN Sainath, CC Chiu… - ICASSP 2021-2021 …, 2021 - ieeexplore.ieee.org
End-to-end (E2E) models have shown to outperform state-of-the-art conventional models for
streaming speech recognition [1] across many dimensions, including quality (as measured …

Thank you for attention: a survey on attention-based artificial neural networks for automatic speech recognition

P Karmakar, SW Teng, G Lu - Intelligent Systems with Applications, 2024 - Elsevier
Attention is a very popular and effective mechanism in artificial neural network-based
sequence-to-sequence models. In this survey paper, a comprehensive review of the different …

On the comparison of popular end-to-end models for large scale speech recognition

J Li, Y Wu, Y Gaur, C Wang, R Zhao, S Liu - arXiv preprint arXiv …, 2020 - arxiv.org
Recently, there has been a strong push to transition from hybrid models to end-to-end (E2E)
models for automatic speech recognition. Currently, there are three promising E2E methods …

Deep learning model for house price prediction using heterogeneous data analysis along with joint self-attention mechanism

PY Wang, CT Chen, JW Su, TY Wang… - IEEE Access, 2021 - ieeexplore.ieee.org
House price prediction is a popular topic, and research teams are increasingly performing
related studies by using deep learning or machine learning models. However, because …