Audio self-supervised learning: A survey
Similar to humans' cognitive ability to generalize knowledge and skills, self-supervised
learning (SSL) targets discovering general representations from large-scale data. This …
WavLM: Large-scale self-supervised pre-training for full stack speech processing
Self-supervised learning (SSL) achieves great success in speech recognition, while limited
exploration has been attempted for other speech processing tasks. As speech signal …
Comparative layer-wise analysis of self-supervised speech models
Many self-supervised speech models, varying in their pre-training objective, input modality,
and pre-training data, have been proposed in the last few years. Despite impressive …
ML-SUPERB: Multilingual speech universal performance benchmark
Speech processing Universal PERformance Benchmark (SUPERB) is a leaderboard to
benchmark the performance of Self-Supervised Learning (SSL) models on various speech …
SpeechPrompt: An exploration of prompt tuning on generative spoken language model for speech processing tasks
Speech representations learned from Self-supervised learning (SSL) models can benefit
various speech processing tasks. However, utilizing SSL representations usually requires …
SUPERB @ SLT 2022: Challenge on generalization and efficiency of self-supervised speech representation learning
We present the SUPERB challenge at SLT 2022, which aims at learning self-supervised
speech representation for better performance, generalization, and efficiency. The challenge …
On the utility of self-supervised models for prosody-related tasks
Self-Supervised Learning (SSL) from speech data has produced models that have achieved
remarkable performance in many tasks, and that are known to implicitly represent many …
What do self-supervised speech models know about words?
Many self-supervised speech models (S3Ms) have been introduced over the last few years,
improving performance and data efficiency on various speech tasks. However, these …
SpeechCLIP: Integrating speech with pre-trained vision and language model
Data-driven speech processing models usually perform well with a large amount of text
supervision, but collecting transcribed speech data is costly. Therefore, we propose Speech …
Boosting self-supervised embeddings for speech enhancement
Self-supervised learning (SSL) representation for speech has achieved state-of-the-art
(SOTA) performance on several downstream tasks. However, there remains room for …