A review of deep learning techniques for speech processing

A Mehrish, N Majumder, R Bharadwaj, R Mihalcea… - Information …, 2023 - Elsevier
The field of speech processing has undergone a transformative shift with the advent of deep
learning. The use of multiple processing layers has enabled the creation of models capable …

A complete survey on generative ai (aigc): Is chatgpt from gpt-4 to gpt-5 all you need?

C Zhang, C Zhang, S Zheng, Y Qiao, C Li… - arXiv preprint arXiv …, 2023 - arxiv.org
As ChatGPT goes viral, generative AI (AIGC, aka AI-generated content) has made headlines
everywhere because of its ability to analyze and create text, images, and beyond. With such …

[PDF][PDF] Recent advances in end-to-end automatic speech recognition

J Li - APSIPA Transactions on Signal and Information …, 2022 - nowpublishers.com
Recently, the speech community is seeing a significant trend of moving from deep neural
network based hybrid modeling to end-to-end (E2E) modeling for automatic speech …

A review of sparse expert models in deep learning

W Fedus, J Dean, B Zoph - arXiv preprint arXiv:2209.01667, 2022 - arxiv.org
Sparse expert models are a thirty-year old concept re-emerging as a popular architecture in
deep learning. This class of architecture encompasses Mixture-of-Experts, Switch …

A comparative study on transformer vs rnn in speech applications

S Karita, N Chen, T Hayashi, T Hori… - 2019 IEEE automatic …, 2019 - ieeexplore.ieee.org
Sequence-to-sequence models have been widely used in end-to-end speech processing,
for example, automatic speech recognition (ASR), speech translation (ST), and text-to …

Recent developments on espnet toolkit boosted by conformer

P Guo, F Boyer, X Chang, T Hayashi… - ICASSP 2021-2021 …, 2021 - ieeexplore.ieee.org
In this study, we present recent developments on ESPnet: End-to-End Speech Processing
toolkit, which mainly involves a recently proposed architecture called Conformer …

RTIDS: A robust transformer-based approach for intrusion detection system

Z Wu, H Zhang, P Wang, Z Sun - IEEE Access, 2022 - ieeexplore.ieee.org
Due to the rapid growth in network traffic and increasing security threats, Intrusion Detection
Systems (IDS) have become increasingly critical in the field of cyber security for providing …

Deep transfer learning for automatic speech recognition: Towards better generalization

H Kheddar, Y Himeur, S Al-Maadeed, A Amira… - Knowledge-Based …, 2023 - Elsevier
Automatic speech recognition (ASR) has recently become an important challenge when
using deep learning (DL). It requires large-scale training datasets and high computational …

Streaming automatic speech recognition with the transformer model

N Moritz, T Hori, J Le - ICASSP 2020-2020 IEEE International …, 2020 - ieeexplore.ieee.org
Encoder-decoder based sequence-to-sequence models have demonstrated state-of-the-art
results in end-to-end automatic speech recognition (ASR). Recently, the transformer …

Intermediate loss regularization for ctc-based speech recognition

J Lee, S Watanabe - ICASSP 2021-2021 IEEE International …, 2021 - ieeexplore.ieee.org
We present a simple and efficient auxiliary loss function for automatic speech recognition
(ASR) based on the connectionist temporal classification (CTC) objective. The proposed …