A review of deep learning techniques for speech processing

A Mehrish, N Majumder, R Bharadwaj, R Mihalcea… - Information …, 2023 - Elsevier
The field of speech processing has undergone a transformative shift with the advent of deep
learning. The use of multiple processing layers has enabled the creation of models capable …

Emformer: Efficient memory transformer based acoustic model for low latency streaming speech recognition

Y Shi, Y Wang, C Wu, CF Yeh, J Chan… - ICASSP 2021-2021 …, 2021 - ieeexplore.ieee.org
This paper proposes an efficient memory transformer Emformer for low latency streaming
speech recognition. In Emformer, the long-range history context is distilled into an …

Thank you for attention: a survey on attention-based artificial neural networks for automatic speech recognition

P Karmakar, SW Teng, G Lu - Intelligent Systems with Applications, 2024 - Elsevier
Attention is a very popular and effective mechanism in artificial neural network-based
sequence-to-sequence models. In this survey paper, a comprehensive review of the different …

A study of transformer-based end-to-end speech recognition system for Kazakh language

M Orken, O Dina, A Keylan, T Tolganay, O Mohamed - Scientific reports, 2022 - nature.com
Today, the Transformer model, which allows parallelization and has its own built-in attention mechanism, has been widely used in the field of speech recognition. The great advantage of …

Understanding the role of self attention for efficient speech recognition

K Shim, J Choi, W Sung - International Conference on Learning …, 2022 - openreview.net
Self-attention (SA) is a critical component of Transformer neural networks that have
succeeded in automatic speech recognition (ASR). In this paper, we analyze the role of SA …
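Since several of the entries above center on self-attention in ASR models, the standard scaled dot-product formulation (Vaswani et al., 2017) that these works build on is reproduced below for reference; it is the general definition, not an excerpt from any of the cited papers.

```latex
% Standard scaled dot-product self-attention; d_k is the key dimension.
% Q, K, V are the query, key, and value matrices projected from the input frames.
\[
\mathrm{Attention}(Q, K, V) \;=\; \mathrm{softmax}\!\left(\frac{QK^{\top}}{\sqrt{d_k}}\right)V
\]
```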

Improving end-to-end contextual speech recognition with fine-grained contextual knowledge selection

M Han, L Dong, Z Liang, M Cai, S Zhou… - ICASSP 2022-2022 …, 2022 - ieeexplore.ieee.org
Nowadays, most methods for end-to-end contextual speech recognition bias the recognition
process towards contextual knowledge. Since all-neural contextual biasing methods rely on …

Streaming transformer-based acoustic models using self-attention with augmented memory

C Wu, Y Wang, Y Shi, CF Yeh, F Zhang - arXiv preprint arXiv:2005.08042, 2020 - arxiv.org
Transformer-based acoustic modeling has achieved great success for both hybrid and
sequence-to-sequence speech recognition. However, it requires access to the full …
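The snippet above is cut off before the method details, but the general pattern of block-wise streaming attention with a memory bank can be sketched as follows. This is a minimal, generic illustration with a hypothetical segment length and a simple mean-pooled summary per segment, not the authors' exact formulation.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def streaming_attention(frames, seg_len=4, d=8, seed=0):
    """Toy block-wise self-attention: each segment attends to itself plus a
    memory bank holding one summary vector per previous segment (illustrative only)."""
    rng = np.random.default_rng(seed)
    Wq, Wk, Wv = (rng.standard_normal((d, d)) for _ in range(3))
    memory, outputs = [], []
    for start in range(0, len(frames), seg_len):
        seg = frames[start:start + seg_len]                  # current segment, shape (s, d)
        ctx = np.vstack(memory + [seg]) if memory else seg   # memory bank + current segment
        q, k, v = seg @ Wq, ctx @ Wk, ctx @ Wv
        outputs.append(softmax(q @ k.T / np.sqrt(d)) @ v)    # attend over memory + segment
        memory.append(seg.mean(axis=0, keepdims=True))       # summarize segment into memory
    return np.vstack(outputs)

out = streaming_attention(np.random.default_rng(1).standard_normal((12, 8)))
print(out.shape)  # (12, 8): latency is bounded by the segment, not the full utterance
```

The point of the sketch is only that attention is restricted to the current segment plus a compact memory, so the model never needs the full utterance before emitting output.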

Privacy-preserving speech emotion recognition through semi-supervised federated learning

V Tsouvalas, T Ozcelebi… - 2022 IEEE International …, 2022 - ieeexplore.ieee.org
Speech Emotion Recognition (SER) refers to the recognition of human emotions from
natural speech. If done accurately, it can offer a number of benefits in building human …
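The entry above concerns SER trained with federated learning. As a point of reference, a generic federated averaging (FedAvg) aggregation step is sketched below; the function name, toy parameters, and client sizes are illustrative assumptions and do not reflect the paper's semi-supervised procedure.

```python
import numpy as np

def fed_avg(client_weights, client_sizes):
    """Weighted average of client model parameters (generic FedAvg step)."""
    total = sum(client_sizes)
    return [
        sum(w[i] * (n / total) for w, n in zip(client_weights, client_sizes))
        for i in range(len(client_weights[0]))
    ]

# Toy example: three clients, each holding two parameter arrays trained locally
# on their own speech, which never leaves the device; only weights are shared.
rng = np.random.default_rng(0)
clients = [[rng.standard_normal((4, 4)), rng.standard_normal(4)] for _ in range(3)]
sizes = [100, 250, 50]   # number of local speech samples per client (hypothetical)
global_params = fed_avg(clients, sizes)
print([p.shape for p in global_params])  # [(4, 4), (4,)]
```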

Research status and prospects of the Transformer in speech recognition tasks.

张晓旭, 马志强, 刘志强, 朱方圆… - Journal of Frontiers of …, 2021 - search.ebscohost.com
As a new deep learning algorithm framework, the Transformer has attracted the attention of more and more researchers and has become a current research hotspot. The self-attention mechanism in the Transformer model is inspired by the way humans attend only to important things …

Tiny transformers for environmental sound classification at the edge

D Elliott, CE Otero, S Wyatt, E Martino - arXiv preprint arXiv:2103.12157, 2021 - arxiv.org
With the growth of the Internet of Things and the rise of Big Data, data processing and
machine learning applications are being moved to cheap and low size, weight, and power …