An overview of noise-robust automatic speech recognition

J Li - APSIPA Transactions on Signal and Information …, 2022 - nowpublishers.com

Recently, the speech community is seeing a significant trend of moving from deep neural
network based hybrid modeling to end-to-end (E2E) modeling for automatic speech …

Cited by 326 | Related articles | All 7 versions

Speech recognition using deep neural networks: A systematic review

AB Nassif, I Shahin, I Attili, M Azzeh, K Shaalan - IEEE access, 2019 - ieeexplore.ieee.org

Over the past decades, a tremendous amount of research has been done on the use of
machine learning for speech processing applications, especially speech recognition …

Cited by 1185 | Related articles | All 9 versions

Conditional diffusion probabilistic model for speech enhancement

YJ Lu, ZQ Wang, S Watanabe… - ICASSP 2022-2022 …, 2022 - ieeexplore.ieee.org

Speech enhancement is a critical component of many user-oriented audio applications, yet
current systems still suffer from distorted and unnatural outputs. While generative models …

Cited by 126 | Related articles | All 7 versions

Cooperative heterogeneous multi-robot systems: A survey

Y Rizk, M Awad, EW Tunstel - ACM Computing Surveys (CSUR), 2019 - dl.acm.org

The emergence of the Internet of things and the widespread deployment of diverse
computing systems have led to the formation of heterogeneous multi-agent systems (MAS) …

Cited by 371 | Related articles | All 3 versions

Hyporadise: An open baseline for generative speech recognition with large language models

C Chen, Y Hu, CHH Yang… - Advances in …, 2024 - proceedings.neurips.cc

Advancements in deep neural networks have allowed automatic speech recognition (ASR)
systems to attain human parity on several publicly available clean speech datasets …

Cited by 17 | Related articles | All 9 versions

Light gated recurrent units for speech recognition

M Ravanelli, P Brakel, M Omologo… - IEEE Transactions on …, 2018 - ieeexplore.ieee.org

A field that has directly benefited from the recent advances in deep learning is automatic
speech recognition (ASR). Despite the great achievements of the past decades, however, a …

Cited by 402 | Related articles | All 7 versions

Cold diffusion for speech enhancement

H Yen, FG Germain, G Wichern… - ICASSP 2023-2023 …, 2023 - ieeexplore.ieee.org

Diffusion models have recently shown promising results for difficult enhancement tasks such
as the conditional and unconditional restoration of natural images and audio signals. In this …

Cited by 76 | Related articles | All 9 versions

Unispeech: Unified speech representation learning with labeled and unlabeled data

C Wang, Y Wu, Y Qian, K Kumatani… - International …, 2021 - proceedings.mlr.press

In this paper, we propose a unified pre-training approach called UniSpeech to learn speech
representations with both labeled and unlabeled data, in which supervised phonetic CTC …

Cited by 117 | Related articles | All 4 versions

Deep learning for environmentally robust speech recognition: An overview of recent developments

Z Zhang, J Geiger, J Pohjalainen, AED Mousa… - ACM Transactions on …, 2018 - dl.acm.org

Eliminating the negative effect of non-stationary environmental noise is a long-standing
research topic for automatic speech recognition but still remains an important challenge …

Cited by 392 | Related articles | All 10 versions

An analytical study of information extraction from unstructured and multidimensional big data

K Adnan, R Akbar - Journal of Big Data, 2019 - Springer

Process of information extraction (IE) is used to extract useful information from unstructured
or semi-structured data. Big data arise new challenges for IE techniques with the rapid …

Cited by 209 | Related articles | All 10 versions