Bridging the gap: A survey on integrating (human) feedback for natural language generation

P Fernandes, A Madaan, E Liu, A Farinhas… - Transactions of the …, 2023 - direct.mit.edu
Natural language generation has witnessed significant advancements due to the training of
large language models on vast internet-scale datasets. Despite these advancements, there …

Evaluating human-language model interaction

M Lee, M Srivastava, A Hardy, J Thickstun… - arXiv preprint arXiv …, 2022 - arxiv.org
Many real-world applications of language models (LMs), such as writing assistance and
code autocomplete, involve human-LM interaction. However, most benchmarks are non …

Take a step back: Evoking reasoning via abstraction in large language models

HS Zheng, S Mishra, X Chen, HT Cheng… - arXiv preprint arXiv …, 2023 - arxiv.org
We present Step-Back Prompting, a simple prompting technique that enables LLMs to
perform abstraction, deriving high-level concepts and first principles from instances containing …

Uncertainty in natural language generation: From theory to applications

J Baan, N Daheim, E Ilia, D Ulmer, HS Li… - arXiv preprint arXiv …, 2023 - arxiv.org
Recent advances in powerful Language Models have allowed Natural Language
Generation (NLG) to emerge as an important technology that can not only perform traditional …

Fairness in language models beyond English: Gaps and challenges

K Ramesh, S Sitaram, M Choudhury - arXiv preprint arXiv:2302.12578, 2023 - arxiv.org
With language models becoming increasingly ubiquitous, it has become essential to
address their inequitable treatment of diverse demographic groups and factors. Most …

BioReader: a retrieval-enhanced text-to-text transformer for biomedical literature

G Frisoni, M Mizutani, G Moro… - Proceedings of the 2022 …, 2022 - aclanthology.org
Recent research has equipped language models with the ability to attend over
relevant and factual information from non-parametric external sources, drawing a …

A comprehensive survey on instruction following

R Lou, K Zhang, W Yin - arXiv preprint arXiv:2303.10475, 2023 - arxiv.org
Task semantics can be expressed by a set of input-output examples or a piece of textual
instruction. Conventional machine learning approaches for natural language processing …

Never-ending learning of user interfaces

J Wu, R Krosnick, E Schoop, A Swearngin… - Proceedings of the 36th …, 2023 - dl.acm.org
Machine learning models have been trained to predict semantic information about user
interfaces (UIs) to make apps more accessible, easier to test, and to automate. Currently …

Analyzing dataset annotation quality management in the wild

JC Klie, RE Castilho, I Gurevych - Computational Linguistics, 2024 - direct.mit.edu
Data quality is crucial for training accurate, unbiased, and trustworthy machine learning
models as well as for their correct evaluation. Recent work, however, has shown that even …

TarGEN: Targeted data generation with large language models

H Gupta, K Scaria, U Anantheswaran, S Verma… - arXiv preprint arXiv …, 2023 - arxiv.org
The rapid advancement of large language models (LLMs) has sparked interest in data
synthesis techniques, aiming to generate diverse and high-quality synthetic datasets …