LLM Merging: Building LLMs Efficiently through Merging

D Tam, M Li, P Yadav, RB Gabrielsson… - NeurIPS 2024 …, 2024 - openreview.net
Training high-performing large language models (LLMs) from scratch is a notoriously
expensive and difficult task, costing hundreds of millions of dollars in compute alone. These …

PLeaS -- Merging Models with Permutations and Least Squares

A Nasery, J Hayase, PW Koh, S Oh - arXiv preprint arXiv:2407.02447, 2024 - arxiv.org
The democratization of machine learning systems has made the process of fine-tuning
accessible to a large number of practitioners, leading to a wide range of open-source …

SoK: On Finding Common Ground in Loss Landscapes Using Deep Model Merging Techniques

A Khan, T Nief, N Hudson, M Sakarvadia… - arXiv preprint arXiv …, 2024 - arxiv.org
Understanding neural networks is crucial to creating reliable and trustworthy deep learning
models. Most contemporary research in interpretability analyzes just one model at a time via …

Layer-wise Model Merging for Unsupervised Domain Adaptation in Segmentation Tasks

R Alcover-Couso, JC SanMiguel… - arXiv preprint arXiv …, 2024 - arxiv.org
Merging parameters of multiple models has resurfaced as an effective strategy to enhance
task performance and robustness, but prior work is limited by the high costs of ensemble …
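The snippet above refers to merging model parameters layer by layer. As a rough, hypothetical illustration of that general idea only (not this paper's specific method), the Python sketch below interpolates two PyTorch state dicts with per-layer mixing coefficients; the `merge_layerwise` helper and the `alphas` weights are assumptions for the example.

```python
# Hypothetical sketch of layer-wise parameter merging (not the paper's method):
# two models with identical architectures are combined by interpolating each
# parameter tensor with its own mixing coefficient. The `alphas` dictionary is
# an assumption; in practice per-layer weights would be chosen, e.g., by
# validation performance on the target domain.
import torch


def merge_layerwise(state_a, state_b, alphas, default_alpha=0.5):
    """Interpolate two state dicts parameter by parameter.

    `alphas` maps a parameter name to a coefficient in [0, 1]; names not
    listed fall back to a plain average controlled by `default_alpha`.
    """
    merged = {}
    for name, tensor_a in state_a.items():
        alpha = alphas.get(name, default_alpha)
        merged[name] = alpha * tensor_a + (1.0 - alpha) * state_b[name]
    return merged


# Toy usage with two identically shaped linear layers.
a, b = torch.nn.Linear(4, 2), torch.nn.Linear(4, 2)
merged = merge_layerwise(a.state_dict(), b.state_dict(), alphas={"weight": 0.7})
a.load_state_dict(merged)  # load the merged parameters back into a model
```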

Rethink Model Re-Basin and the Linear Mode Connectivity

X Qu, S Horvath - arXiv preprint arXiv:2402.05966, 2024 - arxiv.org
Recent studies suggest that with sufficiently wide models, most SGD solutions can, up to
permutation, converge into the same basin. This phenomenon, known as the model re-basin …
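The re-basin claim above is typically tested by permuting one model's units into alignment with another's and then checking that the straight line between the two weight vectors stays in a low-loss region. As a hedged illustration of that interpolation check only (the permutation-alignment step is omitted, and `eval_loss` is a hypothetical routine returning the loss of a state dict on held-out data):

```python
# Minimal sketch of a linear mode connectivity check, assuming the two models
# share an architecture and have already been permutation-aligned.
def interpolate_state(state_a, state_b, t):
    """Pointwise interpolation (1 - t) * A + t * B of two state dicts."""
    return {k: (1.0 - t) * state_a[k] + t * state_b[k] for k in state_a}


def loss_barrier(state_a, state_b, eval_loss, steps=11):
    """Estimate the loss barrier along the straight path between two solutions.

    A barrier near zero suggests the (aligned) solutions sit in the same
    basin, i.e. they are linearly mode connected.
    """
    losses = [
        eval_loss(interpolate_state(state_a, state_b, i / (steps - 1)))
        for i in range(steps)
    ]
    return max(losses) - 0.5 * (losses[0] + losses[-1])
```

If the barrier is large for the raw pair of models but collapses toward zero once a suitable permutation is applied, that is the re-basin behaviour the snippet describes.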