SUPERB-SG: Enhanced speech processing universal PERformance benchmark for semantic and generative...

[PDF] Audio self-supervised learning: A survey

S Liu, A Mallol-Ragolta, E Parada-Cabaleiro, K Qian… - Patterns, 2022 - cell.com

Similar to humans' cognitive ability to generalize knowledge and skills, self-supervised
learning (SSL) targets discovering general representations from large-scale data. This …

Cited by 90 Related articles All 12 versions

[PDF] arxiv.org

WavLM: Large-scale self-supervised pre-training for full stack speech processing

S Chen, C Wang, Z Chen, Y Wu, S Liu… - IEEE Journal of …, 2022 - ieeexplore.ieee.org

Self-supervised learning (SSL) achieves great success in speech recognition, while limited
exploration has been attempted for other speech processing tasks. As speech signal …

Cited by 1109 Related articles All 9 versions

[PDF] arxiv.org

Comparative layer-wise analysis of self-supervised speech models

A Pasad, B Shi, K Livescu - ICASSP 2023-2023 IEEE …, 2023 - ieeexplore.ieee.org

Many self-supervised speech models, varying in their pre-training objective, input modality,
and pre-training data, have been proposed in the last few years. Despite impressive …

Cited by 56 Related articles All 3 versions

[PDF] arxiv.org

ML-SUPERB: Multilingual speech universal performance benchmark

J Shi, D Berrebbi, W Chen, HL Chung, EP Hu… - arXiv preprint arXiv …, 2023 - arxiv.org

Speech processing Universal PERformance Benchmark (SUPERB) is a leaderboard to
benchmark the performance of Self-Supervised Learning (SSL) models on various speech …

Cited by 27 Related articles All 8 versions

[PDF] arxiv.org

SpeechPrompt: An exploration of prompt tuning on generative spoken language model for speech processing tasks

KW Chang, WC Tseng, SW Li, H Lee - arXiv preprint arXiv:2203.16773, 2022 - arxiv.org

Speech representations learned from Self-supervised learning (SSL) models can benefit
various speech processing tasks. However, utilizing SSL representations usually requires …

Cited by 39 Related articles All 6 versions

[PDF] arxiv.org

SUPERB @ SLT 2022: Challenge on generalization and efficiency of self-supervised speech representation learning

T Feng, A Dong, CF Yeh, S Yang, TQ Lin… - 2022 IEEE Spoken …, 2023 - ieeexplore.ieee.org

We present the SUPERB challenge at SLT 2022, which aims at learning self-supervised
speech representation for better performance, generalization, and efficiency. The challenge …

Cited by 28 Related articles All 5 versions

[PDF] arxiv.org

On the utility of self-supervised models for prosody-related tasks

GT Lin, CL Feng, WP Huang, Y Tseng… - 2022 IEEE Spoken …, 2023 - ieeexplore.ieee.org

Self-Supervised Learning (SSL) from speech data has produced models that have achieved
remarkable performance in many tasks, and that are known to implicitly represent many …

Cited by 29 Related articles All 5 versions

[PDF] mit.edu

What do self-supervised speech models know about words?

A Pasad, CM Chien, S Settle, K Livescu - Transactions of the …, 2024 - direct.mit.edu

Many self-supervised speech models (S3Ms) have been introduced over the last few years,
improving performance and data efficiency on various speech tasks. However, these …

Cited by 3 Related articles All 4 versions

SpeechCLIP: Integrating speech with pre-trained vision and language model

YJ Shih, HF Wang, HJ Chang, L Berry… - 2022 IEEE Spoken …, 2023 - ieeexplore.ieee.org

Data-driven speech processing models usually perform well with a large amount of text
supervision, but collecting transcribed speech data is costly. Therefore, we propose Speech …

Cited by 22 Related articles All 5 versions

[PDF] arxiv.org

Boosting self-supervised embeddings for speech enhancement

KH Hung, S Fu, HH Tseng, HT Chiang, Y Tsao… - arXiv preprint arXiv …, 2022 - arxiv.org

Self-supervised learning (SSL) representation for speech has achieved state-of-the-art
(SOTA) performance on several downstream tasks. However, there remains room for …

Cited by 29 Related articles All 6 versions