Phil Hall, Hsuan-Jui Chen, Shang-Wen Li, Shinji Watanabe, Abdelrahman Mohamed, and Hung-yi...

AH Liu, HJ Chang, M Auli, WN Hsu… - Advances in Neural …, 2024 - proceedings.neurips.cc

In this paper, we introduce self-distillation and online clustering for self-supervised speech
representation learning (DinoSR) which combines masked language modeling, self …

Cited by 10 Related articles All 7 versions

[PDF] arxiv.org

Lextreme: A multi-lingual and multi-task benchmark for the legal domain

J Niklaus, V Matoshi, P Rani, A Galassi… - arXiv preprint arXiv …, 2023 - arxiv.org

Lately, propelled by the phenomenal advances around the transformer architecture, the
legal NLP field has enjoyed spectacular growth. To measure progress, well curated and …

Cited by 33 Related articles All 10 versions

[PDF] mit.edu

What do self-supervised speech models know about words?

A Pasad, CM Chien, S Settle, K Livescu - Transactions of the …, 2024 - direct.mit.edu

Many self-supervised speech models (S3Ms) have been introduced over the last few years,
improving performance and data efficiency on various speech tasks. However, these …

Cited by 12 Related articles All 4 versions

[PDF] arxiv.org

What do self-supervised speech models know about words?

A Pasad, CM Chien, S Settle, K Livescu - arXiv preprint arXiv:2307.00162, 2023 - arxiv.org

Many self-supervised speech models (S3Ms) have been introduced over the last few years,
producing performance and data efficiency improvements for a variety of speech tasks …

Cited by 11 Related articles All 3 versions

[PDF] arxiv.org

Pheme: Efficient and Conversational Speech Generation

P Budzianowski, T Sereda, T Cichy, I Vulić - arXiv preprint arXiv …, 2024 - arxiv.org

In recent years, speech generation has seen remarkable progress, now achieving one-shot
generation capability that is often virtually indistinguishable from real human voice …

Cited by 6 Related articles All 2 versions

[PDF] arxiv.org

R-Spin: Efficient Speaker and Noise-invariant Representation Learning with Acoustic Pieces

HJ Chang, J Glass - arXiv preprint arXiv:2311.09117, 2023 - arxiv.org

This paper introduces Robust Spin (R-Spin), a data-efficient self-supervised fine-tuning
framework for speaker and noise-invariant speech representations by learning discrete …

Cited by 1 Related articles All 3 versions

[PDF] acm.org

Evolutionary Multi-objective Optimization for Contextual Adversarial Example Generation

S Zhou, M Huang, Y Sun, K Li - Proceedings of the ACM on Software …, 2024 - dl.acm.org

The emergence of the 'code naturalness' concept, which suggests that software code shares
statistical properties with natural language, paves the way for deep neural networks (DNNs) …

Cited by 1 Related articles