How to Train Data-Efficient LLMs

N. Sachdeva, B. Coleman, W.-C. Kang, J. Ni, L. Hong, E. H. Chi, J. Caverlee, J. McAuley, and D. Z. Cheng. arXiv preprint arXiv:2402.09668, 2024.
The training of large language models (LLMs) is expensive. In this paper, we study data-efficient approaches for pre-training LLMs, i.e., techniques that aim to optimize the Pareto frontier of model quality and training resource/data consumption. We seek to understand the tradeoffs associated with data selection routines based on (i) expensive-to-compute data-quality estimates, and (ii) maximization of coverage and diversity-based measures in the feature space. Our first technique, Ask-LLM, leverages the zero-shot reasoning capabilities …
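The abstract's first family of methods scores candidate training examples with a proxy language model. The sketch below is an illustration of that general idea only: it prompts a small causal LM to answer whether a passage is useful pre-training data and uses the probability of "yes" as the selection score. The proxy model, prompt wording, and keep fraction are assumptions made for this sketch, not the paper's exact recipe.

```python
# Illustrative sketch of LLM-based quality scoring for pre-training data selection.
# The proxy model ("gpt2"), prompt wording, and keep ratio are assumptions for this
# example; they are not taken from the paper.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_NAME = "gpt2"  # stand-in proxy scorer; any instruction-tuned LM could be used
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForCausalLM.from_pretrained(MODEL_NAME).eval()

PROMPT = (
    "###\n{example}\n###\n"
    "Is the text above informative enough to be used for training a language "
    "model? Answer yes or no.\nAnswer:"
)

@torch.no_grad()
def quality_score(example: str) -> float:
    """Return P('yes') among {'yes', 'no'} for the next token as a quality proxy."""
    inputs = tokenizer(
        PROMPT.format(example=example),
        return_tensors="pt",
        truncation=True,
        max_length=1024,
    )
    next_token_logits = model(**inputs).logits[0, -1]
    yes_id = tokenizer(" yes", add_special_tokens=False).input_ids[0]
    no_id = tokenizer(" no", add_special_tokens=False).input_ids[0]
    p_yes, _ = torch.softmax(next_token_logits[[yes_id, no_id]], dim=-1)
    return p_yes.item()

# Rank a candidate pool by score and keep the highest-scoring half.
pool = [
    "The mitochondria produce most of a cell's ATP via oxidative phosphorylation.",
    "click here click here click here best deals!!!",
]
ranked = sorted(pool, key=quality_score, reverse=True)
kept = ranked[: max(1, len(ranked) // 2)]
```

Scoring every candidate with a proxy LLM is expensive per example, which is precisely the quality-versus-cost tradeoff the abstract highlights; the second family of methods it mentions instead maximizes coverage and diversity in a feature space, and is not covered by this sketch.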