FuseChat: Knowledge fusion of chat models

F Wan, L Zhong, Z Yang, R Chen, X Quan - arXiv preprint arXiv …, 2024 - arxiv.org
While training large language models (LLMs) from scratch can indeed lead to models with
distinct capabilities and strengths, it incurs substantial costs and may lead to redundancy in …

Decoding-time language model alignment with multiple objectives

R Shi, Y Chen, Y Hu, A Liu, H Hajishirzi… - arXiv preprint arXiv …, 2024 - arxiv.org
Aligning language models (LMs) to human preferences has emerged as a critical pursuit,
enabling these models to better serve diverse user needs. Existing methods primarily focus …

Strong copyright protection for language models via adaptive model fusion

J Abad, K Donhauser, F Pinto, F Yang - arXiv preprint arXiv:2407.20105, 2024 - arxiv.org
The risk of language models unintentionally reproducing copyrighted material from their
training data has led to the development of various protective measures. In this paper, we …

Cool-Fusion: Fuse large language models without training

C Liu, X Quan, Y Pan, L Lin, W Wu, X Chen - arXiv preprint arXiv …, 2024 - arxiv.org
We focus on the problem of fusing two or more heterogeneous large language models
(LLMs) to facilitate their complementary strengths. One of the challenges on model fusion is …