A survey on LoRA of large language models
Y Mao, Y Ge, Y Fan, W Xu, Y Mi, Z Hu… - Frontiers of Computer …, 2025 - Springer
Low-Rank Adaptation (LoRA), which updates dense neural network layers with
pluggable low-rank matrices, is one of the best-performing parameter-efficient fine-tuning …
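For reference, a minimal sketch of the LoRA update this snippet describes: a frozen dense layer is augmented with a pluggable low-rank product BA. Class and parameter names here are illustrative, not taken from the survey.

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Frozen dense layer plus a pluggable low-rank update: y = xW^T + scale * x(BA)^T."""
    def __init__(self, in_features, out_features, rank=8, alpha=16.0):
        super().__init__()
        self.base = nn.Linear(in_features, out_features, bias=False)
        self.base.weight.requires_grad_(False)                   # dense weights stay frozen
        self.A = nn.Parameter(torch.randn(rank, in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(out_features, rank))   # zero init: update starts at 0
        self.scale = alpha / rank

    def forward(self, x):
        return self.base(x) + (x @ self.A.T @ self.B.T) * self.scale
```

Only A and B are trained, so the adapter can be merged into or detached from the base weights after fine-tuning.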
Pareto low-rank adapters: Efficient multi-task learning with preferences
N Dimitriadis, P Frossard, F Fleuret - arXiv preprint arXiv:2407.08056, 2024 - arxiv.org
Multi-task trade-offs at inference time can be addressed via Pareto Front
Learning (PFL) methods that parameterize the Pareto front with a single model, contrary to …
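One common way to realize PFL with low-rank adapters, shown below as a sketch of the general idea rather than this paper's exact method, is to keep one adapter per task objective and mix them at inference with a user-chosen preference vector.

```python
import torch
import torch.nn as nn

class PreferenceMixedLoRA(nn.Module):
    """Frozen base layer plus per-task low-rank adapters, mixed by a preference vector
    (illustrative construction; names and initialization are assumptions)."""
    def __init__(self, in_f, out_f, num_tasks, rank=4):
        super().__init__()
        self.base = nn.Linear(in_f, out_f, bias=False)
        self.base.weight.requires_grad_(False)
        self.A = nn.Parameter(torch.randn(num_tasks, rank, in_f) * 0.01)
        self.B = nn.Parameter(torch.zeros(num_tasks, out_f, rank))

    def forward(self, x, pref):
        # pref: (num_tasks,) nonnegative weights summing to 1, picked at inference time;
        # varying pref traces out different points along the learned Pareto front.
        delta = torch.einsum("t,tor,tri->oi", pref, self.B, self.A)
        return self.base(x) + x @ delta.T
```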
MoIN: Mixture of Introvert Experts to Upcycle an LLM
The goal of this paper is to improve (upcycle) an existing large language model without the
prohibitive requirements of continued pre-training of the full model. The idea is to split the …
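The snippet is truncated, but the mixture-of-experts framing suggests routing each input to one of many independently trained lightweight experts on a frozen base. The following is one illustrative reading of that idea, not the paper's actual architecture; all names and the hard top-1 routing are assumptions.

```python
import torch
import torch.nn as nn

class IndependentExpertLayer(nn.Module):
    """Frozen base layer plus independently trained low-rank experts; a router
    picks one expert per example (hypothetical sketch of the upcycling idea)."""
    def __init__(self, in_f, out_f, num_experts, rank=4):
        super().__init__()
        self.base = nn.Linear(in_f, out_f, bias=False)
        self.base.weight.requires_grad_(False)
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(in_f, rank, bias=False),
                          nn.Linear(rank, out_f, bias=False))
            for _ in range(num_experts)
        )
        self.router = nn.Linear(in_f, num_experts)

    def forward(self, x):
        # x: (batch, in_f); hard top-1 routing, one expert per example
        idx = self.router(x).argmax(dim=-1)
        out = self.base(x)
        adapted = torch.zeros_like(out)
        for e, expert in enumerate(self.experts):
            mask = idx == e
            if mask.any():
                adapted[mask] = expert(x[mask])
        return out + adapted
```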
SLTrain: a sparse plus low-rank approach for parameter and memory efficient pretraining
Large language models (LLMs) have shown impressive capabilities across various tasks.
However, training LLMs from scratch requires significant computational power and extensive …
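The title names the core parameterization: each weight matrix is represented as a low-rank product plus a sparse term, W = BA + S, during pretraining. A minimal sketch of that decomposition follows; the fixed random sparse support and all hyperparameters are illustrative assumptions, not details from the paper.

```python
import torch
import torch.nn as nn

class SparsePlusLowRankLinear(nn.Module):
    """Weight parameterized as W = B @ A + S, with S sparse on a fixed random
    support whose values are trained (sketch of the sparse-plus-low-rank idea)."""
    def __init__(self, in_f, out_f, rank=32, sparsity=0.03):
        super().__init__()
        self.A = nn.Parameter(torch.randn(rank, in_f) / in_f**0.5)
        self.B = nn.Parameter(torch.randn(out_f, rank) / rank**0.5)
        n = int(sparsity * in_f * out_f)
        # fixed random support; only the values at these positions are learned
        self.register_buffer("idx", torch.randperm(in_f * out_f)[:n])
        self.vals = nn.Parameter(torch.zeros(n))
        self.out_f, self.in_f = out_f, in_f

    def forward(self, x):
        S = torch.zeros(self.out_f * self.in_f, device=x.device, dtype=x.dtype)
        S[self.idx] = self.vals
        W = self.B @ self.A + S.view(self.out_f, self.in_f)
        return x @ W.T
```

Both factors and the sparse values are trained from scratch, so the layer stores far fewer parameters than a dense weight of the same shape.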
CoMERA: Computing- and Memory-Efficient Training via Rank-Adaptive Tensor Optimization
Z Yang, S Choudhary, X Xie, C Gao… - arXiv preprint arXiv …, 2024 - arxiv.org
Training large AI models such as deep learning recommendation systems and foundation
language (or multi-modal) models requires massive GPU resources and computing time. The high …
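CoMERA itself uses tensor-network compression, but the rank-adaptive ingredient can be illustrated with a much-simplified matrix analogue: a factorized layer with a gate vector that an L1 penalty drives toward zero, pruning unneeded ranks during training. Everything below is a toy stand-in under that assumption.

```python
import torch
import torch.nn as nn

class RankAdaptiveLinear(nn.Module):
    """Factorized layer W = B diag(g) A; penalizing |g| lets training shrink the
    effective rank (toy analogue of rank-adaptive tensor compression)."""
    def __init__(self, in_f, out_f, max_rank=64):
        super().__init__()
        self.A = nn.Parameter(torch.randn(max_rank, in_f) / in_f**0.5)
        self.B = nn.Parameter(torch.randn(out_f, max_rank) / max_rank**0.5)
        self.gate = nn.Parameter(torch.ones(max_rank))

    def forward(self, x):
        return ((x @ self.A.T) * self.gate) @ self.B.T

    def l1_penalty(self):
        # add lam * l1_penalty() to the training loss to drive unused ranks to zero
        return self.gate.abs().sum()
```

After training, rank components whose gate entries are near zero can be dropped, shrinking both compute and memory at inference.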
Neural Precision Polarization: Simplifying Neural Network Inference with Dual-Level Precision
We introduce a precision polarization scheme for DNN inference that utilizes only very low
and very high precision levels, assigning low precision to the majority of network weights …
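A minimal sketch of such a dual-level scheme, assuming (as an illustration only) that the high-precision minority is chosen by weight magnitude and the rest is fake-quantized to a uniform low-bit grid:

```python
import torch

def polarize_weights(w, high_frac=0.01, low_bits=4):
    """Keep the largest-magnitude fraction of weights at full precision and
    quantize the remainder to a low-bit grid (illustrative selection rule)."""
    flat = w.abs().flatten()
    k = max(1, int(high_frac * flat.numel()))
    thresh = flat.topk(k).values.min()
    high_mask = w.abs() >= thresh
    # uniform symmetric quantization for the low-precision majority
    levels = 2 ** (low_bits - 1) - 1
    scale = w[~high_mask].abs().max().clamp(min=1e-8) / levels
    low = torch.round(w / scale).clamp(-levels, levels) * scale
    return torch.where(high_mask, w, low)
```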
Information Propagation in Modular Language Modeling and Web Tracking
Z Su - 2024 - di.ku.dk
Information propagation is the process through which data are transmitted within a
system. The growth of large-scale web datasets has led to explosive growth in information …