Recent advances in end-to-end automatic speech recognition
J Li - APSIPA Transactions on Signal and Information …, 2022 - nowpublishers.com
Recently, the speech community is seeing a significant trend of moving from deep neural
network based hybrid modeling to end-to-end (E2E) modeling for automatic speech …
Adaptation algorithms for neural network-based speech recognition: An overview
We present a structured overview of adaptation algorithms for neural network-based speech
recognition, considering both hybrid hidden Markov model/neural network systems and end …
End-to-end speech recognition: A survey
In the last decade of automatic speech recognition (ASR) research, the introduction of deep
learning has brought considerable reductions in word error rate of more than 50% relative …
Advanced long-context end-to-end speech recognition using context-expanded transformers
This paper addresses end-to-end automatic speech recognition (ASR) for long audio
recordings such as lecture and conversational speeches. Most end-to-end ASR models are …
Confidence score based speaker adaptation of conformer speech recognition systems
Speaker adaptation techniques provide a powerful solution to customise automatic speech
recognition (ASR) systems for individual users. Practical application of unsupervised model …
An experimental review of speaker diarization methods with application to two-speaker conversational telephone speech recordings
We performed an experimental review of current diarization systems for the conversational
telephone speech (CTS) domain. In detail, we considered a total of eight different algorithms …
Attention-inspired artificial neural networks for speech processing: A systematic review
N Zacarias-Morales, P Pancardo… - Symmetry, 2021 - mdpi.com
Artificial Neural Networks (ANNs) were created inspired by the neural networks in the
human brain and have been widely applied in speech processing. The application areas of …
Rapid Speaker Adaptation for Conformer Transducer: Attention and Bias Are All You Need.
Conformer transducer achieves new state-of-the-art end-to-end (E2E) system performance
and has become increasingly appealing for production. In this paper, we study how to …
Transformer-Based Long-Context End-to-End Speech Recognition.
This paper presents an approach to long-context end-to-end automatic speech recognition
(ASR) using Transformers, aiming at improving ASR accuracy for long audio recordings …
AdaStreamLite: Environment-adaptive Streaming Speech Recognition on Mobile Devices
Y Wei, J Xiong, H Liu, Y Yu, J Pan, J Du - Proceedings of the ACM on …, 2024 - dl.acm.org
Streaming speech recognition aims to transcribe speech to text in a streaming manner,
providing real-time speech interaction for smartphone users. However, it is not trivial to …