Model stock: All we need is just a few fine-tuned models

DH Jang, S Yun, D Han - European Conference on Computer Vision, 2025 - Springer
This paper introduces an efficient fine-tuning method for large pre-trained models, offering
strong in-distribution (ID) and out-of-distribution (OOD) performance. Breaking away from …

Model merging in llms, mllms, and beyond: Methods, theories, applications and opportunities

E Yang, L Shen, G Guo, X Wang, X Cao… - arXiv preprint arXiv …, 2024 - arxiv.org
Model merging is an efficient technique in the machine learning community that requires
neither the collection of raw training data nor expensive …
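For orientation, one common merging recipe is task arithmetic: add the scaled parameter
deltas ("task vectors") of fine-tuned models to the pre-trained weights. A minimal sketch,
assuming PyTorch state dicts with matching keys and a hypothetical scaling factor lam
(this illustrates one representative method, not an algorithm from the survey itself):

    import torch

    def task_arithmetic_merge(pretrained_sd, finetuned_sds, lam=0.3):
        """Merge models by adding scaled task vectors (finetuned - pretrained) to the base."""
        merged = {}
        for key, base in pretrained_sd.items():
            # Sum the per-task parameter deltas, then scale and add to the base weights.
            delta = sum(sd[key].float() - base.float() for sd in finetuned_sds)
            merged[key] = base.float() + lam * delta
        return merged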

Deep model fusion: A survey

W Li, Y Peng, M Zhang, L Ding, H Hu… - arXiv preprint arXiv …, 2023 - arxiv.org
Deep model fusion/merging is an emerging technique that merges the parameters or
predictions of multiple deep learning models into a single one. It combines the abilities of …

Badmerging: Backdoor attacks against model merging

J Zhang, J Chi, Z Li, K Cai, Y Zhang… - Proceedings of the 2024 on …, 2024 - dl.acm.org
Fine-tuning pre-trained models for downstream tasks has led to a proliferation of
open-sourced task-specific models. Recently, Model Merging (MM) has emerged as an effective …

On Efficient Training of Large-Scale Deep Learning Models

L Shen, Y Sun, Z Yu, L Ding, X Tian, D Tao - ACM Computing Surveys, 2024 - dl.acm.org
The field of deep learning has witnessed significant progress in recent times, particularly in
areas such as computer vision (CV), natural language processing (NLP), and speech. The …

Learning scalable model soup on a single gpu: An efficient subspace training strategy

T Li, W Jiang, F Liu, X Huang, JT Kwok - European Conference on …, 2025 - Springer
Pre-training followed by fine-tuning is widely adopted among practitioners. Performance
can be improved by “model soups” [46], which explore various hyperparameter configurations …
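For context, a uniform model soup simply averages the weights of several checkpoints
fine-tuned under different hyperparameters. A minimal sketch, assuming PyTorch-style
state dicts with matching keys:

    import torch

    def uniform_soup(state_dicts):
        """Uniformly average the parameters of several fine-tuned checkpoints."""
        soup = {}
        for key in state_dicts[0]:
            soup[key] = torch.stack([sd[key].float() for sd in state_dicts]).mean(dim=0)
        return soup

    # Hypothetical usage: average three checkpoints from a hyperparameter sweep.
    # soup = uniform_soup([torch.load(p) for p in ("ft_a.pt", "ft_b.pt", "ft_c.pt")])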

Better Loss Landscape Visualization for Deep Neural Networks with Trajectory Information

R Ding, T Li, X Huang - Asian Conference on Machine …, 2024 - proceedings.mlr.press
The loss landscape of neural networks offers a valuable perspective on the trainability,
generalization, and robustness of networks, and hence its visualization has been …

Exponential moving average of weights in deep learning: Dynamics and benefits

D Morales-Brotons, T Vogels… - Transactions on Machine …, 2024 - openreview.net
Weight averaging of Stochastic Gradient Descent (SGD) iterates is a popular method for
training deep learning models. While it is often used as part of complex training pipelines to …
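For reference, an exponential moving average keeps a shadow copy of the weights that is
updated as ema = beta * ema + (1 - beta) * theta after each optimizer step. A minimal
PyTorch-style sketch (beta = 0.999 is an assumed, typical decay, not a value from the paper):

    import torch

    @torch.no_grad()
    def ema_update(ema_params, params, beta=0.999):
        """Pull the shadow (EMA) weights toward the current SGD iterate."""
        for e, p in zip(ema_params, params):
            e.mul_(beta).add_(p, alpha=1.0 - beta)

    # Hypothetical usage after each optimizer step:
    # ema_update(list(ema_model.parameters()), list(model.parameters()))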

Ensembling improves stability and power of feature selection for deep learning models

PK Gyawali, X Liu, J Zou, Z He - Machine Learning in …, 2022 - proceedings.mlr.press
With the growing adoption of deep learning models in different real-world domains,
including computational biology, it is often necessary to understand which data features are …