Spoken instruction understanding in air traffic control: Challenge, technique, and application
Y Lin - Aerospace, 2021 - mdpi.com
In air traffic control (ATC), speech communication with radio transmission is the primary way
to exchange information between the controller and aircrew. A wealth of contextual …
A comparative study on transformer vs rnn in speech applications
Sequence-to-sequence models have been widely used in end-to-end speech processing,
for example, automatic speech recognition (ASR), speech translation (ST), and text-to …
Attention, please! A survey of neural attention models in deep learning
A de Santana Correia, EL Colombini - Artificial Intelligence Review, 2022 - Springer
In humans, Attention is a core property of all perceptual and cognitive operations. Given our
limited ability to process competing sources, attention mechanisms select, modulate, and …
Neural speech synthesis with transformer network
Although end-to-end neural text-to-speech (TTS) methods (such as Tacotron2) are proposed
and achieve state-of-the-art performance, they still suffer from two problems: 1) low efficiency …
Emformer: Efficient memory transformer based acoustic model for low latency streaming speech recognition
This paper proposes an efficient memory transformer Emformer for low latency streaming
speech recognition. In Emformer, the long-range history context is distilled into an …
Transformer-based acoustic modeling for hybrid speech recognition
We propose and evaluate transformer-based acoustic models (AMs) for hybrid speech
recognition. Several modeling choices are discussed in this work, including various …
Improving transformer-based end-to-end speech recognition with connectionist temporal classification and language model integration
T Nakatani - proc. INTERSPEECH, 2019 - isca-archive.org
The state-of-the-art neural network architecture named Transformer has been used
successfully for many sequence-to-sequence transformation tasks. The advantage of this …
Root mean square layer normalization
B Zhang, R Sennrich - Advances in Neural Information …, 2019 - proceedings.neurips.cc
Layer normalization (LayerNorm) has been successfully applied to various deep neural
networks to help stabilize training and boost model convergence because of its capability in …
Transformers in speech processing: A survey
The remarkable success of transformers in the field of natural language processing has
sparked the interest of the speech-processing community, leading to an exploration of their …
Thank you for attention: a survey on attention-based artificial neural networks for automatic speech recognition
Attention is a very popular and effective mechanism in artificial neural network-based
sequence-to-sequence models. In this survey paper, a comprehensive review of the different …