Improving transformer-based end-to-end speech recognition with connectionist temporal classificat...

A Mehrish, N Majumder, R Bharadwaj, R Mihalcea… - Information …, 2023 - Elsevier

The field of speech processing has undergone a transformative shift with the advent of deep
learning. The use of multiple processing layers has enabled the creation of models capable …

被引用次数：118 相关文章所有 6 个版本

[PDF] arxiv.org

A complete survey on generative ai (aigc): Is chatgpt from gpt-4 to gpt-5 all you need?

C Zhang, C Zhang, S Zheng, Y Qiao, C Li… - arXiv preprint arXiv …, 2023 - arxiv.org

As ChatGPT goes viral, generative AI (AIGC, aka AI-generated content) has made headlines
everywhere because of its ability to analyze and create text, images, and beyond. With such …

被引用次数：165 相关文章所有 4 个版本

[PDF] nowpublishers.com

[PDF][PDF] Recent advances in end-to-end automatic speech recognition

J Li - APSIPA Transactions on Signal and Information …, 2022 - nowpublishers.com

Recently, the speech community is seeing a significant trend of moving from deep neural
network based hybrid modeling to end-to-end (E2E) modeling for automatic speech …

被引用次数：338 相关文章所有 7 个版本

[PDF] arxiv.org

A review of sparse expert models in deep learning

W Fedus, J Dean, B Zoph - arXiv preprint arXiv:2209.01667, 2022 - arxiv.org

Sparse expert models are a thirty-year old concept re-emerging as a popular architecture in
deep learning. This class of architecture encompasses Mixture-of-Experts, Switch …

被引用次数：95 相关文章所有 2 个版本

[PDF] arxiv.org

A comparative study on transformer vs rnn in speech applications

S Karita, N Chen, T Hayashi, T Hori… - 2019 IEEE automatic …, 2019 - ieeexplore.ieee.org

Sequence-to-sequence models have been widely used in end-to-end speech processing,
for example, automatic speech recognition (ASR), speech translation (ST), and text-to …

被引用次数：817 相关文章所有 10 个版本

[PDF] arxiv.org

Recent developments on espnet toolkit boosted by conformer

P Guo, F Boyer, X Chang, T Hayashi… - ICASSP 2021-2021 …, 2021 - ieeexplore.ieee.org

In this study, we present recent developments on ESPnet: End-to-End Speech Processing
toolkit, which mainly involves a recently proposed architecture called Conformer …

被引用次数：284 相关文章所有 8 个版本

[PDF] ieee.org

RTIDS: A robust transformer-based approach for intrusion detection system

Z Wu, H Zhang, P Wang, Z Sun - IEEE Access, 2022 - ieeexplore.ieee.org

Due to the rapid growth in network traffic and increasing security threats, Intrusion Detection
Systems (IDS) have become increasingly critical in the field of cyber security for providing …

被引用次数：95 相关文章所有 2 个版本

[PDF] arxiv.org

Deep transfer learning for automatic speech recognition: Towards better generalization

H Kheddar, Y Himeur, S Al-Maadeed, A Amira… - Knowledge-Based …, 2023 - Elsevier

Automatic speech recognition (ASR) has recently become an important challenge when
using deep learning (DL). It requires large-scale training datasets and high computational …

被引用次数：46 相关文章所有 5 个版本

[PDF] arxiv.org

Streaming automatic speech recognition with the transformer model

N Moritz, T Hori, J Le - ICASSP 2020-2020 IEEE International …, 2020 - ieeexplore.ieee.org

Encoder-decoder based sequence-to-sequence models have demonstrated state-of-the-art
results in end-to-end automatic speech recognition (ASR). Recently, the transformer …

被引用次数：212 相关文章所有 11 个版本

[PDF] arxiv.org

Intermediate loss regularization for ctc-based speech recognition

J Lee, S Watanabe - ICASSP 2021-2021 IEEE International …, 2021 - ieeexplore.ieee.org

We present a simple and efficient auxiliary loss function for automatic speech recognition
(ASR) based on the connectionist temporal classification (CTC) objective. The proposed …

被引用次数：130 相关文章所有 5 个版本