Unsupervised speaker adaptation using attention-based speaker memory for end-to-end ASR

J Li - APSIPA Transactions on Signal and Information …, 2022 - nowpublishers.com

Recently, the speech community is seeing a significant trend of moving from deep neural
network based hybrid modeling to end-to-end (E2E) modeling for automatic speech …

Cited by 338 - Related articles - All 7 versions

[PDF] ieee.org

Adaptation algorithms for neural network-based speech recognition: An overview

P Bell, J Fainberg, O Klejch, J Li… - IEEE Open Journal …, 2020 - ieeexplore.ieee.org

We present a structured overview of adaptation algorithms for neural network-based speech
recognition, considering both hybrid hidden Markov model/neural network systems and end …

Cited by 90 - Related articles - All 7 versions

[PDF] ieee.org

End-to-end speech recognition: A survey

R Prabhavalkar, T Hori, TN Sainath… - … on Audio, Speech …, 2023 - ieeexplore.ieee.org

In the last decade of automatic speech recognition (ASR) research, the introduction of deep
learning has brought considerable reductions in word error rate of more than 50% relative …

Cited by 88 - Related articles - All 6 versions

[PDF] arxiv.org

Advanced long-context end-to-end speech recognition using context-expanded transformers

T Hori, N Moritz, C Hori, JL Roux - arXiv preprint arXiv:2104.09426, 2021 - arxiv.org

This paper addresses end-to-end automatic speech recognition (ASR) for long audio
recordings such as lecture and conversational speeches. Most end-to-end ASR models are …

Cited by 36 - Related articles - All 6 versions

[PDF] arxiv.org

Confidence score based speaker adaptation of conformer speech recognition systems

J Deng, X Xie, T Wang, M Cui, B Xue… - … on Audio, Speech …, 2023 - ieeexplore.ieee.org

Speaker adaptation techniques provide a powerful solution to customise automatic speech
recognition (ASR) systems for individual users. Practical application of unsupervised model …

Cited by 8 - Related articles - All 5 versions

[HTML] sciencedirect.com

[HTML] An experimental review of speaker diarization methods with application to two-speaker conversational telephone speech recordings

L Serafini, S Cornell, G Morrone, E Zovato… - Computer Speech & …, 2023 - Elsevier

We performed an experimental review of current diarization systems for the conversational
telephone speech (CTS) domain. In detail, we considered a total of eight different algorithms …

Cited by 7 - Related articles - All 6 versions

[PDF] mdpi.com

Attention-inspired artificial neural networks for speech processing: A systematic review

N Zacarias-Morales, P Pancardo… - Symmetry, 2021 - mdpi.com

Artificial Neural Networks (ANNs) were created inspired by the neural networks in the
human brain and have been widely applied in speech processing. The application areas of …

Cited by 23 - Related articles - All 8 versions

[PDF] isca-archive.org

[PDF] Rapid Speaker Adaptation for Conformer Transducer: Attention and Bias Are All You Need.

Y Huang, G Ye, J Li, Y Gong - Interspeech, 2021 - isca-archive.org

Conformer transducer achieves new state-of-the-art end-to-end (E2E) system performance
and has become increasingly appealing for production. In this paper, we study how to …

Cited by 17 - Related articles - All 4 versions

[PDF] isca-archive.org

[PDF] Transformer-Based Long-Context End-to-End Speech Recognition.

T Hori, N Moritz, C Hori, J Le Roux - Interspeech, 2020 - isca-archive.org

This paper presents an approach to long-context end-to-end automatic speech recognition
(ASR) using Transformers, aiming at improving ASR accuracy for long audio recordings …

Cited by 38 - Related articles - All 10 versions

[PDF] umass.edu

AdaStreamLite: Environment-adaptive Streaming Speech Recognition on Mobile Devices

Y Wei, J Xiong, H Liu, Y Yu, J Pan, J Du - Proceedings of the ACM on …, 2024 - dl.acm.org

Streaming speech recognition aims to transcribe speech to text in a streaming manner,
providing real-time speech interaction for smartphone users. However, it is not trivial to …

Cited by 1 - Related articles - All 2 versions