A survey on data selection for language models
A major factor in the recent success of large language models is the use of enormous and
ever-growing text datasets for unsupervised pre-training. However, naively training a model …
QLoRA: Efficient finetuning of quantized LLMs
We present QLoRA, an efficient finetuning approach that reduces memory usage enough to
finetune a 65B parameter model on a single 48GB GPU while preserving full 16-bit …
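The recipe this abstract summarizes, a frozen 4-bit-quantized base model with trainable low-rank adapters, can be sketched with the Hugging Face transformers, peft, and bitsandbytes stack; the model id and LoRA hyperparameters below are illustrative assumptions, not values taken from the paper.

```python
# Minimal QLoRA-style setup: 4-bit NF4 base model + LoRA adapters.
# Assumes: pip install transformers peft bitsandbytes accelerate
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                      # store frozen base weights in 4 bits
    bnb_4bit_quant_type="nf4",              # NormalFloat4, introduced by QLoRA
    bnb_4bit_use_double_quant=True,         # also quantize the quantization constants
    bnb_4bit_compute_dtype=torch.bfloat16,  # dequantize to bf16 for matmuls
)

model = AutoModelForCausalLM.from_pretrained(
    "huggyllama/llama-7b",                  # illustrative model id
    quantization_config=bnb_config,
    device_map="auto",
)

lora_config = LoraConfig(                   # hyperparameters are illustrative
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)  # only adapter weights receive gradients
model.print_trainable_parameters()
```

Only the small adapter matrices are trained; the quantized base weights stay frozen, which is what keeps the memory footprint within a single GPU.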
A survey of large language models
Language is essentially a complex, intricate system of human expressions governed by
grammatical rules. It poses a significant challenge to develop capable AI algorithms for …
The Flan collection: Designing data and methods for effective instruction tuning
We study the design decisions of publicly available instruction tuning methods, by
reproducing and breaking down the development of Flan 2022 (Chung et al., 2022) …
A survey on in-context learning
With the increasing capabilities of large language models (LLMs), in-context learning (ICL)
has emerged as a new paradigm for natural language processing (NLP), where LLMs make …
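The paradigm this snippet describes, conditioning a frozen model on a few labeled demonstrations in the prompt, reduces to string construction; a minimal sketch (the demonstrations and template are made up for illustration):

```python
# Build a few-shot in-context learning prompt: the model sees labeled
# demonstrations and labels a new input, with no weight updates.
demonstrations = [
    ("The movie was a delight.", "positive"),
    ("I want my money back.", "negative"),
]
query = "The plot dragged, but the acting was superb."

prompt = "".join(
    f"Review: {text}\nSentiment: {label}\n\n" for text, label in demonstrations
)
prompt += f"Review: {query}\nSentiment:"
print(prompt)  # feed this string to any causal LM and read off the next token
```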
Scaling data-constrained language models
The current trend of scaling language models involves increasing both parameter count and
training dataset size. Extrapolating this trend suggests that training dataset size may soon be …
Scaling instruction-finetuned language models
Finetuning language models on a collection of datasets phrased as instructions has been
shown to improve model performance and generalization to unseen tasks. In this paper we …
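Phrasing a dataset "as instructions" means rewriting each supervised example into a natural-language request/response pair before finetuning; a sketch of that transformation (the NLI template here is an illustrative assumption, not one of the paper's templates):

```python
# Turn a plain NLI classification example into an instruction-tuning example.
def to_instruction_example(premise: str, hypothesis: str, label: str) -> dict:
    return {
        "input": (
            "Does the premise entail the hypothesis? "
            "Answer yes, no, or maybe.\n"
            f"Premise: {premise}\nHypothesis: {hypothesis}"
        ),
        "target": label,
    }

example = to_instruction_example(
    "A dog is running in the park.", "An animal is outdoors.", "yes"
)
# Finetuning on many tasks rendered this way is what drives the reported
# generalization to unseen instructions.
```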
Crosslingual generalization through multitask finetuning
Multitask prompted finetuning (MTF) has been shown to help large language models
generalize to new tasks in a zero-shot setting, but so far explorations of MTF have focused …
Challenging BIG-Bench tasks and whether chain-of-thought can solve them
BIG-Bench (Srivastava et al., 2022) is a diverse evaluation suite that focuses on tasks
believed to be beyond the capabilities of current language models. Language models have …
Larger language models do in-context learning differently
We study how in-context learning (ICL) in language models is affected by semantic priors
versus input-label mappings. We investigate two setups: ICL with flipped labels and ICL with …
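The "flipped labels" setup the abstract mentions is easy to picture: the in-context demonstrations carry inverted labels, so following the demonstrations means overriding the model's semantic prior. A small illustrative sketch (texts and labels are made up):

```python
# Flipped-label ICL: demonstrations deliberately invert the true labels.
# A model tracking input-label mappings should predict "negative" for a
# positive query; a model relying on semantic priors will ignore the flip.
flipped_demos = [
    ("What a wonderful film!", "negative"),               # true label: positive
    ("Utterly boring and a waste of time.", "positive"),  # true label: negative
]
query = "I loved every minute of it."

prompt = "".join(f"Input: {t}\nLabel: {l}\n\n" for t, l in flipped_demos)
prompt += f"Input: {query}\nLabel:"
# Per the paper's finding, larger models tend to follow the flipped mapping,
# while smaller ones fall back on prior label semantics.
```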