Examining modularity in multilingual LMs via language-specialized subnetworks
Recent work has proposed explicitly inducing language-wise modularity in multilingual LMs
via sparse fine-tuning (SFT) on per-language subnetworks as a means of better guiding …
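For context, a minimal sketch of what per-language sparse fine-tuning can look like, assuming a precomputed binary mask for each parameter tensor; the masks dict and the plain SGD update are illustrative, not the cited paper's exact procedure:

import torch
import torch.nn as nn

def sparse_finetune_step(model: nn.Module,
                         masks: dict[str, torch.Tensor],
                         loss: torch.Tensor,
                         lr: float = 1e-4) -> None:
    """One sparse fine-tuning update: gradients are zeroed outside the
    language-specific subnetwork defined by `masks` (hypothetical helper,
    not from the cited paper). `masks` maps parameter names to binary
    tensors with the same shape as the parameter."""
    loss.backward()
    with torch.no_grad():
        for name, param in model.named_parameters():
            if param.grad is None:
                continue
            mask = masks.get(name)
            if mask is not None:
                param.grad.mul_(mask)          # keep only subnetwork gradients
            param.add_(param.grad, alpha=-lr)  # plain SGD update
            param.grad.zero_()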
Gradient-based intra-attention pruning on pre-trained language models
Pre-trained language models achieve superior performance but are computationally
expensive. Techniques such as pruning and knowledge distillation have been developed to …
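As a rough illustration of gradient-based importance scoring for pruning, the sketch below uses first-order Taylor importance on a single linear layer; the cited paper's intra-attention criterion may differ, and the calibration step shown in comments is hypothetical:

import torch
import torch.nn as nn

def taylor_importance_mask(layer: nn.Linear, keep_ratio: float = 0.5) -> torch.Tensor:
    """Binary mask keeping the weights with the largest |w * dL/dw|
    (first-order Taylor importance). Assumes `layer.weight.grad` has
    already been populated by a backward pass on calibration data."""
    scores = (layer.weight * layer.weight.grad).abs().flatten()
    k = max(1, int(keep_ratio * scores.numel()))
    threshold = torch.topk(scores, k).values.min()
    return (scores >= threshold).view_as(layer.weight).float()

# Hypothetical usage after a backward pass on calibration data:
# loss.backward()
# query = model.encoder.layer[0].attention.self.query
# mask = taylor_importance_mask(query)
# with torch.no_grad():
#     query.weight.mul_(mask)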
Cross-Lingual Transfer with Language-Specific Subnetworks for Low-Resource Dependency Parsing
Large multilingual language models typically share their parameters across all languages,
which enables cross-lingual task transfer, but learning can also be hindered when training …
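One simple, hypothetical way to quantify how much two language-specific subnetworks share parameters is mask overlap; the mask_overlap helper below is illustrative, not taken from the cited work:

import torch

def mask_overlap(mask_a: torch.Tensor, mask_b: torch.Tensor) -> float:
    """Jaccard overlap between two binary parameter masks, a simple proxy
    for how much two language-specific subnetworks share parameters."""
    a, b = mask_a.bool(), mask_b.bool()
    intersection = (a & b).sum().item()
    union = (a | b).sum().item()
    return intersection / union if union else 0.0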
Multilinguality and Multiculturalism: Towards more Effective and Inclusive Neural Language Models
R Choenni - 2025 - eprints.illc.uva.nl
Large-scale pretraining requires vast amounts of text in a given language, which limits the
applicability of such techniques to a handful of high-resource languages. Therefore …