Challenges and applications of large language models
Large Language Models (LLMs) went from non-existent to ubiquitous in the machine
learning discourse within a few years. Due to the fast pace of the field, it is difficult to identify …
Language model behavior: A comprehensive survey
Transformer language models have received widespread public attention, yet their
generated text is often surprising even to NLP researchers. In this survey, we discuss over …
Harnessing the power of LLMs in practice: A survey on ChatGPT and beyond
This article presents a comprehensive and practical guide for practitioners and end-users
working with Large Language Models (LLMs) in their downstream Natural Language …
On second thought, let's not think step by step! Bias and toxicity in zero-shot reasoning
Generating a Chain of Thought (CoT) has been shown to consistently improve large
language model (LLM) performance on a wide range of NLP tasks. However, prior work has …
Revisiting out-of-distribution robustness in NLP: Benchmarks, analysis, and LLMs evaluations
This paper reexamines the research on out-of-distribution (OOD) robustness in the field of
NLP. We find that the distribution shift settings in previous studies commonly lack adequate …
The CoT Collection: Improving zero-shot and few-shot learning of language models via chain-of-thought fine-tuning
Language models (LMs) with less than 100B parameters are known to perform poorly on
chain-of-thought (CoT) reasoning in contrast to large LMs when solving unseen tasks. In this …
Foundational challenges in assuring alignment and safety of large language models
This work identifies 18 foundational challenges in assuring the alignment and safety of large
language models (LLMs). These challenges are organized into three different categories …
Inverse scaling: When bigger isn't better
Work on scaling laws has found that large language models (LMs) show predictable
improvements to overall loss with increased scale (model size, training data, and compute) …
Instruction-following evaluation through verbalizer manipulation
While instruction-tuned models have shown remarkable success in various natural
language processing tasks, accurately evaluating their ability to follow instructions remains …
Language models are not naysayers: an analysis of language models on negation benchmarks
Negation has been shown to be a major bottleneck for masked language models, such as
BERT. However, whether this finding still holds for larger-sized auto-regressive language …