Recent advances in natural language processing via large pre-trained language models: A survey

B Min, H Ross, E Sulem, APB Veyseh… - ACM Computing …, 2023 - dl.acm.org
Large, pre-trained language models (PLMs) such as BERT and GPT have drastically
changed the Natural Language Processing (NLP) field. For numerous NLP tasks …

A survey on data selection for language models

A Albalak, Y Elazar, SM Xie, S Longpre… - arXiv preprint arXiv …, 2024 - arxiv.org
A major factor in the recent success of large language models is the use of enormous and
ever-growing text datasets for unsupervised pre-training. However, naively training a model …

Visual instruction tuning

H Liu, C Li, Q Wu, YJ Lee - Advances in neural information …, 2024 - proceedings.neurips.cc
Instruction tuning large language models (LLMs) using machine-generated instruction-
following data has been shown to improve zero-shot capabilities on new tasks, but the idea …

Scaling instruction-finetuned language models

HW Chung, L Hou, S Longpre, B Zoph, Y Tay… - Journal of Machine …, 2024 - jmlr.org
Finetuning language models on a collection of datasets phrased as instructions has been
shown to improve model performance and generalization to unseen tasks. In this paper we …

Factscore: Fine-grained atomic evaluation of factual precision in long form text generation

S Min, K Krishna, X Lyu, M Lewis, W Yih… - arXiv preprint arXiv …, 2023 - arxiv.org
Evaluating the factuality of long-form text generated by large language models (LMs) is non-
trivial because (1) generations often contain a mixture of supported and unsupported pieces …

Ultrafeedback: Boosting language models with high-quality feedback

G Cui, L Yuan, N Ding, G Yao, W Zhu, Y Ni, G Xie, Z Liu… - 2023 - openreview.net
Reinforcement learning from human feedback (RLHF) has become a pivotal technique in
aligning large language models (LLMs) with human preferences. In RLHF practice …

Aligning large language models with human: A survey

Y Wang, W Zhong, L Li, F Mi, X Zeng, W Huang… - arXiv preprint arXiv …, 2023 - arxiv.org
Large Language Models (LLMs) trained on extensive textual corpora have emerged as
leading solutions for a broad array of Natural Language Processing (NLP) tasks. Despite …

Chatgpt beyond english: Towards a comprehensive evaluation of large language models in multilingual learning

VD Lai, NT Ngo, APB Veyseh, H Man… - arXiv preprint arXiv …, 2023 - arxiv.org
Over the last few years, large language models (LLMs) have emerged as the most important
breakthroughs in natural language processing (NLP) that fundamentally transform research …

Rewarded soups: towards pareto-optimal alignment by interpolating weights fine-tuned on diverse rewards

A Rame, G Couairon, C Dancette… - Advances in …, 2024 - proceedings.neurips.cc
Foundation models are first pre-trained on vast unsupervised datasets and then fine-tuned
on labeled data. Reinforcement learning, notably from human feedback (RLHF), can further …

Reasoning or reciting? exploring the capabilities and limitations of language models through counterfactual tasks

Z Wu, L Qiu, A Ross, E Akyürek, B Chen… - arXiv preprint arXiv …, 2023 - arxiv.org
The impressive performance of recent language models across a wide range of tasks
suggests that they possess a degree of abstract reasoning skills. Are these skills general …