Training neural networks from scratch with parallel low-rank adapters

Y Mao, Y Ge, Y Fan, W Xu, Y Mi, Z Hu… - Frontiers of Computer …, 2025 - Springer

Abstract Low-Rank Adaptation (LoRA), which updates the dense neural network layers with
pluggable low-rank matrices, is one of the best-performing parameter-efficient fine-tuning …

Cited by 7 - Related articles - All 4 versions

Pareto low-rank adapters: Efficient multi-task learning with preferences

N Dimitriadis, P Frossard, F Fleuret - arXiv preprint arXiv:2407.08056, 2024 - arxiv.org

Dealing with multi-task trade-offs during inference can be addressed via Pareto Front
Learning (PFL) methods that parameterize the Pareto Front with a single model, contrary to …

Cited by 2 - Related articles - All 3 versions

MoIN: Mixture of Introvert Experts to Upcycle an LLM

A Tejankar, KL Navaneet, U Panchal… - arXiv preprint arXiv …, 2024 - arxiv.org

The goal of this paper is to improve (upcycle) an existing large language model without the
prohibitive requirements of continued pre-training of the full-model. The idea is to split the …

SLTrain: a sparse plus low-rank approach for parameter and memory efficient pretraining

A Han, J Li, W Huang, M Hong, A Takeda… - arXiv preprint arXiv …, 2024 - arxiv.org

Large language models (LLMs) have shown impressive capabilities across various tasks.
However, training LLMs from scratch requires significant computational power and extensive …

Cited by 3 - Related articles - All 2 versions

CoMERA: Computing-and Memory-Efficient Training via Rank-Adaptive Tensor Optimization

Z Yang, S Choudhary, X Xie, C Gao… - arXiv preprint arXiv …, 2024 - arxiv.org

Training large AI models such as deep learning recommendation systems and foundation
language (or multi-modal) models costs massive GPUs and computing time. The high …

Cited by 3 - Related articles - All 2 versions

Neural Precision Polarization: Simplifying Neural Network Inference with Dual-Level Precision

D Jayasuriya, N Darabi, MB Hashem… - arXiv preprint arXiv …, 2024 - arxiv.org

We introduce a precision polarization scheme for DNN inference that utilizes only very low
and very high precision levels, assigning low precision to the majority of network weights …

[PDF] Information Propagation in Modular Language Modeling and Web Tracking

Z Su - 2024 - di.ku.dk

Abstract Information propagation is the process through which data are transmitted within a
system. The growth of large-scale web datasets has led to explosive growth in information …