TERA: Self-supervised learning of transformer encoder representation for speech
We introduce a self-supervised speech pre-training method called TERA, which stands for
Transformer Encoder Representations from Alteration. Recent approaches often learn by …
Survey on deep neural networks in speech and vision systems
This survey presents a review of state-of-the-art deep neural network architectures,
algorithms, and systems in speech and vision applications. Recent advances in deep …
THCHS-30: A free Chinese speech corpus
D Wang, X Zhang - arXiv preprint arXiv:1512.01882, 2015 - arxiv.org
Speech data is crucially important for speech recognition research. There are quite some
speech databases that can be purchased at prices that are reasonable for most research …
A review of shorthand systems: From brachygraphy to microtext and beyond
Human civilizations have performed the art of writing across continents and over different
time periods. In order to speed up the writing process, the art of shorthand (brachygraphy) …
Sequence modeling with CTC
A Hannun - Distill, 2017 - distill.pub
Consider speech recognition. We have a dataset of audio clips and corresponding
transcripts. Unfortunately, we don't know how the characters in the transcript align to the …
RNNDROP: A novel dropout for RNNs in ASR
Recently, recurrent neural networks (RNN) have achieved the state-of-the-art performance
in several applications that deal with temporal data, e.g., speech recognition, handwriting …
A survey of recent DNN architectures on the TIMIT phone recognition task
J Michalek, J Vaněk - Text, Speech, and Dialogue: 21st International …, 2018 - Springer
In this survey paper, we have evaluated several recent deep neural network (DNN)
architectures on a TIMIT phone recognition task. We chose the TIMIT corpus due to its …
Towards quantum language models
I Basile, F Tamburini - Proceedings of the 2017 Conference on …, 2017 - aclanthology.org
This paper presents a new approach for building Language Models using the Quantum
Probability Theory, a Quantum Language Model (QLM). It mainly shows that relying on this …
Community-supported shared infrastructure in support of speech accessibility
M Hasegawa-Johnson, X Zheng, H Kim… - Journal of Speech …, 2024 - pubs.asha.org
Purpose: The Speech Accessibility Project (SAP) intends to facilitate research and
development in automatic speech recognition (ASR) and other machine learning tasks for …
Cascaded tuning to amplitude modulation for natural sound recognition
T Koumura, H Terashima, S Furukawa - Journal of Neuroscience, 2019 - Soc Neuroscience
The auditory system converts the physical properties of a sound waveform to neural
activities and processes them for recognition. During the process, the tuning to amplitude …