Layer-wise analysis of a self-supervised speech representation model
Recently proposed self-supervised learning approaches have been successful for pre-
training speech representation models. The utility of these learned representations has been …
Reconsidering read and spontaneous speech: Causal perspectives on the generation of training data for automatic speech recognition
Superficially, read and spontaneous speech—the two main kinds of training data for
automatic speech recognition—appear as complementary, but are equal: pairs of texts and …
Automatic pronunciation assessment using self-supervised speech representation learning
E Kim, JJ Jeon, H Seo, H Kim - arXiv preprint arXiv:2204.03863, 2022 - arxiv.org
Self-supervised learning (SSL) approaches such as wav2vec 2.0 and HuBERT models have
shown promising results in various downstream tasks in the speech community. In particular …
Understanding the role of self attention for efficient speech recognition
Self-attention (SA) is a critical component of Transformer neural networks that have
succeeded in automatic speech recognition (ASR). In this paper, we analyze the role of SA …
What do self-supervised speech models know about words?
Many self-supervised speech models (S3Ms) have been introduced over the last few years,
improving performance and data efficiency on various speech tasks. However, these …
Deep versus wide: An analysis of student architectures for task-agnostic knowledge distillation of self-supervised speech models
Self-supervised learning (SSL) is seen as a very promising approach with high performance
for several speech downstream tasks. Since the parameters of SSL models are generally so …
Probing speech emotion recognition transformers for linguistic knowledge
Large, pre-trained neural networks consisting of self-attention layers (transformers) have
recently achieved state-of-the-art results on several speech emotion recognition (SER) …
Automated recognition of Alzheimer's dementia using bag-of-deep-features and model ensembling
Alzheimer's dementia is a progressive neurodegenerative disease that causes cognitive and
physical impairment. It severely deteriorates the quality of life in affected individuals. An …
Evidence of vocal tract articulation in self-supervised learning of speech
Recent self-supervised learning (SSL) models have proven to learn rich representations of
speech, which can readily be utilized by diverse downstream tasks. To understand such …