Examining modularity in multilingual LMs via language-specialized subnetworks
Recent work has proposed explicitly inducing language-wise modularity in multilingual LMs
via sparse fine-tuning (SFT) on per-language subnetworks as a means of better guiding …
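For context, a minimal sketch of what per-language sparse fine-tuning can look like, assuming a precomputed binary mask for each parameter tensor; the masks dict and the plain SGD update are illustrative, not the cited paper's exact procedure:

import torch
import torch.nn as nn

def sparse_finetune_step(model: nn.Module,
                         masks: dict[str, torch.Tensor],
                         loss: torch.Tensor,
                         lr: float = 1e-4) -> None:
    """One sparse fine-tuning update: gradients are zeroed outside the
    language-specific subnetwork defined by `masks` (hypothetical helper,
    not from the cited paper). `masks` maps parameter names to binary
    tensors with the same shape as the parameter."""
    loss.backward()
    with torch.no_grad():
        for name, param in model.named_parameters():
            if param.grad is None:
                continue
            mask = masks.get(name)
            if mask is not None:
                param.grad.mul_(mask)          # keep only subnetwork gradients
            param.add_(param.grad, alpha=-lr)  # plain SGD update
            param.grad.zero_()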
Gradient-based intra-attention pruning on pre-trained language models
Pre-trained language models achieve superior performance but are computationally
expensive. Techniques such as pruning and knowledge distillation have been developed to …
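As a rough illustration of gradient-based importance scoring for pruning, the sketch below uses first-order Taylor importance on a single linear layer; the cited paper's intra-attention criterion may differ, and the calibration step shown in comments is hypothetical:

import torch
import torch.nn as nn

def taylor_importance_mask(layer: nn.Linear, keep_ratio: float = 0.5) -> torch.Tensor:
    """Binary mask keeping the weights with the largest |w * dL/dw|
    (first-order Taylor importance). Assumes `layer.weight.grad` has
    already been populated by a backward pass on calibration data."""
    scores = (layer.weight * layer.weight.grad).abs().flatten()
    k = max(1, int(keep_ratio * scores.numel()))
    threshold = torch.topk(scores, k).values.min()
    return (scores >= threshold).view_as(layer.weight).float()

# Hypothetical usage after a backward pass on calibration data:
# loss.backward()
# query = model.encoder.layer[0].attention.self.query
# mask = taylor_importance_mask(query)
# with torch.no_grad():
#     query.weight.mul_(mask)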
Cross-Lingual Transfer with Language-Specific Subnetworks for Low-Resource Dependency Parsing
Large multilingual language models typically share their parameters across all languages,
which enables cross-lingual task transfer, but learning can also be hindered when training …
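One simple, hypothetical way to quantify how much two language-specific subnetworks share parameters is mask overlap; the mask_overlap helper below is illustrative, not taken from the cited work:

import torch

def mask_overlap(mask_a: torch.Tensor, mask_b: torch.Tensor) -> float:
    """Jaccard overlap between two binary parameter masks, a simple proxy
    for how much two language-specific subnetworks share parameters."""
    a, b = mask_a.bool(), mask_b.bool()
    intersection = (a & b).sum().item()
    union = (a | b).sum().item()
    return intersection / union if union else 0.0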
Multilinguality and Multiculturalism: Towards more Effective and Inclusive Neural Language Models
R Choenni - 2025 - eprints.illc.uva.nl
Large-scale pretraining requires vast amounts of text in a given language, which limits the
applicability of such techniques to a handful of high-resource languages. Therefore …