Recent advances in end-to-end automatic speech recognition

J Li - APSIPA Transactions on Signal and Information …, 2022 - nowpublishers.com
Recently, the speech community is seeing a significant trend of moving from deep neural
network based hybrid modeling to end-to-end (E2E) modeling for automatic speech …

Speech recognition using deep neural networks: A systematic review

AB Nassif, I Shahin, I Attili, M Azzeh, K Shaalan - IEEE Access, 2019 - ieeexplore.ieee.org
Over the past decades, a tremendous amount of research has been done on the use of
machine learning for speech processing applications, especially speech recognition …

Conditional diffusion probabilistic model for speech enhancement

YJ Lu, ZQ Wang, S Watanabe… - ICASSP 2022-2022 …, 2022 - ieeexplore.ieee.org
Speech enhancement is a critical component of many user-oriented audio applications, yet
current systems still suffer from distorted and unnatural outputs. While generative models …

Cooperative heterogeneous multi-robot systems: A survey

Y Rizk, M Awad, EW Tunstel - ACM Computing Surveys (CSUR), 2019 - dl.acm.org
The emergence of the Internet of things and the widespread deployment of diverse
computing systems have led to the formation of heterogeneous multi-agent systems (MAS) …

HyPoradise: An open baseline for generative speech recognition with large language models

C Chen, Y Hu, CHH Yang… - Advances in …, 2024 - proceedings.neurips.cc
Advancements in deep neural networks have allowed automatic speech recognition (ASR)
systems to attain human parity on several publicly available clean speech datasets …

Light gated recurrent units for speech recognition

M Ravanelli, P Brakel, M Omologo… - IEEE Transactions on …, 2018 - ieeexplore.ieee.org
A field that has directly benefited from the recent advances in deep learning is automatic
speech recognition (ASR). Despite the great achievements of the past decades, however, a …

Cold diffusion for speech enhancement

H Yen, FG Germain, G Wichern… - ICASSP 2023-2023 …, 2023 - ieeexplore.ieee.org
Diffusion models have recently shown promising results for difficult enhancement tasks such
as the conditional and unconditional restoration of natural images and audio signals. In this …

UniSpeech: Unified speech representation learning with labeled and unlabeled data

C Wang, Y Wu, Y Qian, K Kumatani… - International …, 2021 - proceedings.mlr.press
In this paper, we propose a unified pre-training approach called UniSpeech to learn speech
representations with both labeled and unlabeled data, in which supervised phonetic CTC …

Deep learning for environmentally robust speech recognition: An overview of recent developments

Z Zhang, J Geiger, J Pohjalainen, AED Mousa… - ACM Transactions on …, 2018 - dl.acm.org
Eliminating the negative effect of non-stationary environmental noise is a long-standing
research topic for automatic speech recognition but still remains an important challenge …

An analytical study of information extraction from unstructured and multidimensional big data

K Adnan, R Akbar - Journal of Big Data, 2019 - Springer
Process of information extraction (IE) is used to extract useful information from unstructured
or semi-structured data. Big data arise new challenges for IE techniques with the rapid …