Pretext tasks selection for multitask self-supervised audio representation learning

Beyond just vision: A review on self-supervised representation learning on multimodal and temporal data

S Deldari, H Xue, A Saeed, J He, DV Smith… - arXiv preprint arXiv …, 2022 - arxiv.org

Recently, Self-Supervised Representation Learning (SSRL) has attracted much attention in
the field of computer vision, speech, natural language processing (NLP), and recently, with …

Cited by 36 - Related articles - All 2 versions

Speech self-supervised representation benchmarking: Are we doing it right?

S Zaiem, Y Kemiche, T Parcollet, S Essid… - arXiv preprint arXiv …, 2023 - arxiv.org

Self-supervised learning (SSL) has recently allowed leveraging large datasets of unlabeled
speech signals to reach impressive performance on speech tasks using only small amounts …

Cited by 22 - Related articles - All 8 versions

Improved active multi-task representation learning via lasso

Y Wang, Y Chen, K Jamieson… - … Conference on Machine …, 2023 - proceedings.mlr.press

To leverage the copious amount of data from source tasks and overcome the scarcity of the
target task samples, representation learning based on multi-task pretraining has become a …

Cited by 9 - Related articles - All 8 versions

Speech self-supervised representations benchmarking: a case for larger probing heads

S Zaiem, Y Kemiche, T Parcollet, S Essid… - arXiv preprint arXiv …, 2023 - arxiv.org

Self-supervised learning (SSL) leverages large datasets of unlabeled speech to reach
impressive performance with reduced amounts of annotated data. The high number of …

Cited by 9 - Related articles - All 6 versions

Losses can be blessings: Routing self-supervised speech representations towards efficient multilingual and multitask speech processing

Y Fu, Y Zhang, K Qian, Z Ye, Z Yu… - Advances in Neural …, 2022 - proceedings.neurips.cc

Self-supervised learning (SSL) for rich speech representations has achieved empirical
success in low-resource Automatic Speech Recognition (ASR) and other speech processing …

Cited by 5 - Related articles - All 9 versions

Fine-tuning strategies for faster inference using speech self-supervised models: a comparative study

S Zaiem, R Algayres, T Parcollet… - … , Speech, and Signal …, 2023 - ieeexplore.ieee.org

Self-supervised learning (SSL) has allowed substantial progress in Automatic Speech
Recognition (ASR) performance in low-resource settings. In this context, it has been …

Cited by 14 - Related articles - All 33 versions

Creating musical features using multi-faceted, multi-task encoders based on transformers

T Greer, X Shi, B Ma, S Narayanan - Scientific Reports, 2023 - nature.com

Computational machine intelligence approaches have enabled a variety of music-centric
technologies in support of creating, sharing and interacting with music content. A strong …

Cited by 1 - Related articles - All 8 versions

Benchmarking Representations for Speech, Music, and Acoustic Events

M La Quatra, A Koudounas, L Vaiani, E Baralis… - arXiv preprint arXiv …, 2024 - arxiv.org

Limited diversity in standardized benchmarks for evaluating audio representation learning
(ARL) methods may hinder systematic comparison of current methods' capabilities. We …

Cited by 3 - Related articles - All 2 versions

Facial expression recognition based on zero-addition pretext training and feature conjunction-selection network in human-robot interaction

CS Jiang, ZT Liu, J She - IEEE Sensors Journal, 2023 - ieeexplore.ieee.org

The design of the feature extraction process and training strategy are crucial aspects of
achieving high-performance facial expression recognition (FER). Although the introduction …

Cited by 2 - Related articles - All 2 versions

Sound and visual representation learning with multiple pretraining tasks

AB Vasudevan, D Dai… - Proceedings of the IEEE …, 2022 - openaccess.thecvf.com

Different self-supervised tasks (SSL) reveal different features from the data. The learned
feature representations can exhibit different performance for each downstream task. In this …

Cited by 7 - Related articles - All 11 versions