A comprehensive overview of large language models

H Naveed, AU Khan, S Qiu, M Saqib, S Anwar… - arXiv preprint arXiv …, 2023 - arxiv.org
Large Language Models (LLMs) have recently demonstrated remarkable capabilities in
natural language processing tasks and beyond. This success of LLMs has led to a large …

Neural prompt search

Y Zhang, K Zhou, Z Liu - arXiv preprint arXiv:2206.04673, 2022 - arxiv.org
The size of vision models has grown exponentially over the last few years, especially after
the emergence of Vision Transformer. This has motivated the development of parameter …

On the effectiveness of parameter-efficient fine-tuning

Z Fu, H Yang, AMC So, W Lam, L Bing… - Proceedings of the AAAI …, 2023 - ojs.aaai.org
Fine-tuning pre-trained models has been ubiquitously proven to be effective in a wide range
of NLP tasks. However, fine-tuning the whole model is parameter inefficient as it always …

Exploring adapter-based transfer learning for recommender systems: Empirical studies and practical insights

J Fu, F Yuan, Y Song, Z Yuan, M Cheng… - Proceedings of the 17th …, 2024 - dl.acm.org
Adapters, plug-in neural network modules with a small number of tunable parameters, have emerged
as a parameter-efficient transfer learning technique for adapting pre-trained models to …
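For orientation, a minimal sketch of the bottleneck-adapter idea described in this entry, assuming a PyTorch-style residual down-project/up-project module; the class name, bottleneck size, and activation are illustrative choices, not details taken from the cited paper.

# Sketch: a bottleneck adapter inserted into a frozen pretrained layer.
# Only the adapter's small weight matrices are trained.
import torch
import torch.nn as nn

class Adapter(nn.Module):
    def __init__(self, hidden_dim: int, bottleneck_dim: int = 64):
        super().__init__()
        self.down = nn.Linear(hidden_dim, bottleneck_dim)  # project down
        self.up = nn.Linear(bottleneck_dim, hidden_dim)    # project back up
        self.act = nn.GELU()

    def forward(self, x):
        # Residual connection preserves the pretrained representation;
        # the adapter learns only a small task-specific correction.
        return x + self.up(self.act(self.down(x)))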

Contrastive graph prompt-tuning for cross-domain recommendation

Z Yi, I Ounis, C Macdonald - ACM Transactions on Information Systems, 2023 - dl.acm.org
Recommender systems commonly suffer from the long-standing data sparsity problem
where insufficient user-item interaction data limits the systems' ability to make accurate …

Fusing finetuned models for better pretraining

L Choshen, E Venezian, N Slonim, Y Katz - arXiv preprint arXiv …, 2022 - arxiv.org
Pretrained models are the standard starting point for training. This approach consistently
outperforms the use of a random initialization. However, pretraining is a costly endeavour …

AutoPEFT: Automatic Configuration Search for Parameter-Efficient Fine-Tuning

H Zhou, X Wan, I Vulić, A Korhonen - Transactions of the Association …, 2024 - direct.mit.edu
Large pretrained language models are widely used in downstream NLP tasks via task-
specific fine-tuning, but such procedures can be costly. Recently, Parameter-Efficient Fine …

Cold fusion: Collaborative descent for distributed multitask finetuning

S Don-Yehiya, E Venezian, C Raffel, N Slonim… - arXiv preprint arXiv …, 2022 - arxiv.org
We propose a new paradigm to continually evolve pretrained models, denoted ColD Fusion.
It provides the benefits of multitask learning but leverages distributed computation with …

Deep model fusion: A survey

W Li, Y Peng, M Zhang, L Ding, H Hu… - arXiv preprint arXiv …, 2023 - arxiv.org
Deep model fusion/merging is an emerging technique that merges the parameters or
predictions of multiple deep learning models into a single one. It combines the abilities of …
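As a concrete instance of the parameter-level fusion this survey covers, a minimal sketch of uniform weight averaging across finetuned checkpoints that share an architecture; the function name and the uniform weighting are illustrative assumptions rather than the survey's own method.

# Sketch: merge several finetuned models by averaging their parameters.
import torch

def average_state_dicts(state_dicts):
    # Assumes all checkpoints share identical keys and tensor shapes.
    merged = {}
    for key in state_dicts[0]:
        merged[key] = torch.stack(
            [sd[key].float() for sd in state_dicts]
        ).mean(dim=0)
    return merged

# Usage: load the merged weights back into a model of the same architecture,
# e.g. model.load_state_dict(average_state_dicts([ckpt_a, ckpt_b, ckpt_c])).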

MixPHM: redundancy-aware parameter-efficient tuning for low-resource visual question answering

J Jiang, N Zheng - … of the IEEE/CVF Conference on …, 2023 - openaccess.thecvf.com
Recently, finetuning pretrained vision-language models (VLMs) has been a prevailing
paradigm for achieving state-of-the-art performance in VQA. However, as VLMs scale, it …