Cosmos QA: Machine reading comprehension with contextual commonsense reasoning

K Mahowald, AA Ivanova, IA Blank, N Kanwisher… - Trends in Cognitive …, 2024 - cell.com

Large language models (LLMs) have come closest among all models to date to mastering
human language, yet opinions about their linguistic and cognitive capabilities remain split …

被引用次数：327 相关文章所有 10 个版本

[PDF] acm.org

Pre-train, prompt, and predict: A systematic survey of prompting methods in natural language processing

P Liu, W Yuan, J Fu, Z Jiang, H Hayashi… - ACM Computing …, 2023 - dl.acm.org

This article surveys and organizes research works in a new paradigm in natural language
processing, which we dub “prompt-based learning.” Unlike traditional supervised learning …

被引用次数：4129 相关文章所有 4 个版本

[PDF] mlr.press

The flan collection: Designing data and methods for effective instruction tuning

S Longpre, L Hou, T Vu, A Webson… - International …, 2023 - proceedings.mlr.press

We study the design decision of publicly available instruction tuning methods, by
reproducing and breaking down the development of Flan 2022 (Chung et al., 2022) …

被引用次数：510 相关文章所有 8 个版本

[PDF] neurips.cc

Deep bidirectional language-knowledge graph pretraining

M Yasunaga, A Bosselut, H Ren… - Advances in …, 2022 - proceedings.neurips.cc

Pretraining a language model (LM) on text has been shown to help various downstream
NLP tasks. Recent works show that a knowledge graph (KG) can complement text data …

被引用次数：189 相关文章所有 15 个版本

[PDF] arxiv.org

Finetuned language models are zero-shot learners

J Wei, M Bosma, VY Zhao, K Guu, AW Yu… - arXiv preprint arXiv …, 2021 - arxiv.org

This paper explores a simple method for improving the zero-shot learning abilities of
language models. We show that instruction tuning--finetuning language models on a …

被引用次数：2770 相关文章所有 6 个版本

[PDF] arxiv.org

Metaicl: Learning to learn in context

S Min, M Lewis, L Zettlemoyer, H Hajishirzi - arXiv preprint arXiv …, 2021 - arxiv.org

We introduce MetaICL (Meta-training for In-Context Learning), a new meta-training
framework for few-shot learning where a pretrained language model is tuned to do in …

被引用次数：364 相关文章所有 5 个版本

[PDF] neurips.cc

Ties-merging: Resolving interference when merging models

P Yadav, D Tam, L Choshen… - Advances in Neural …, 2024 - proceedings.neurips.cc

Transfer learning–ie, further fine-tuning a pre-trained model on a downstream task–can
confer significant advantages, including improved downstream performance, faster …

被引用次数：100 相关文章所有 7 个版本

[PDF] arxiv.org

Weak-to-strong generalization: Eliciting strong capabilities with weak supervision

C Burns, P Izmailov, JH Kirchner, B Baker… - arXiv preprint arXiv …, 2023 - arxiv.org

Widely used alignment techniques, such as reinforcement learning from human feedback
(RLHF), rely on the ability of humans to supervise model behavior-for example, to evaluate …

被引用次数：115 相关文章所有 7 个版本

[PDF] arxiv.org

Cross-task generalization via natural language crowdsourcing instructions

S Mishra, D Khashabi, C Baral, H Hajishirzi - arXiv preprint arXiv …, 2021 - arxiv.org

Humans (eg, crowdworkers) have a remarkable ability in solving different tasks, by simply
reading textual instructions that define them and looking at a few examples. Despite the …

被引用次数：540 相关文章所有 6 个版本

[PDF] arxiv.org

Delta tuning: A comprehensive study of parameter efficient methods for pre-trained language models

N Ding, Y Qin, G Yang, F Wei, Z Yang, Y Su… - arXiv preprint arXiv …, 2022 - arxiv.org

Despite the success, the process of fine-tuning large-scale PLMs brings prohibitive
adaptation costs. In fact, fine-tuning all the parameters of a colossal model and retaining …

被引用次数：201 相关文章所有 6 个版本