Model merging in LLMs, MLLMs, and beyond: Methods, theories, applications and opportunities

E Yang, L Shen, G Guo, X Wang, X Cao… - arXiv preprint arXiv …, 2024 - arxiv.org
Model merging is an efficient empowerment technique in the machine learning community
that does not require the collection of raw training data and does not require expensive …

On the loss of context-awareness in general instruction fine-tuning

Y Wang, A Bai, N Peng, CJ Hsieh - arXiv preprint arXiv:2411.02688, 2024 - arxiv.org
Pretrained Large Language Models (LLMs) require post-training methods such as
supervised fine-tuning (SFT) on instruction-response pairs to enable instruction following …

Entropic Distribution Matching in Supervised Fine-tuning of LLMs: Less Overfitting and Better Diversity

Z Li, C Chen, T Xu, Z Qin, J Xiao, R Sun… - arXiv preprint arXiv …, 2024 - arxiv.org
Large language models rely on Supervised Fine-Tuning (SFT) to specialize in downstream
tasks. Cross Entropy (CE) loss is the de facto choice in SFT, but it often leads to overfitting …

FlipGuard: Defending Preference Alignment against Update Regression with Constrained Optimization

M Zhu, Y Liu, Q Wang, J Guo, Z Mao - arXiv preprint arXiv:2410.00508, 2024 - arxiv.org
Recent breakthroughs in preference alignment have significantly improved Large Language
Models' ability to generate texts that align with human preferences and values. However …

VickreyFeedback: Cost-efficient Data Construction for Reinforcement Learning from Human Feedback

G Zhang, J Duan - International Conference on Principles and Practice of …, 2024 - Springer
This paper addresses the cost-efficiency aspect of Reinforcement Learning from Human
Feedback (RLHF). RLHF leverages datasets of human preferences over outputs of LLMs to …