Language model behavior: A comprehensive survey
Transformer language models have received widespread public attention, yet their
generated text is often surprising even to NLP researchers. In this survey, we discuss over …
ChatGPT is not enough: Enhancing large language models with knowledge graphs for fact-aware language modeling
Recently, ChatGPT, a representative large language model (LLM), has gained considerable
attention due to its powerful emergent abilities. Some researchers suggest that LLMs could …
Pythia: A suite for analyzing large language models across training and scaling
S Biderman, H Schoelkopf… - International …, 2023 - proceedings.mlr.press
How do large language models (LLMs) develop and evolve over the course of training?
How do these patterns change as models scale? To answer these questions, we introduce …
Large language models struggle to learn long-tail knowledge
The Internet contains a wealth of knowledge—from the birthdays of historical figures to
tutorials on how to code—all of which may be learned by language models. However, while …
Impact of pretraining term frequencies on few-shot reasoning
Pretrained Language Models (LMs) have demonstrated ability to perform numerical
reasoning by extrapolating from a few examples in few-shot settings. However, the extent to …
Trustworthy LLMs: A survey and guideline for evaluating large language models' alignment
Ensuring alignment, which refers to making models behave in accordance with human
intentions [1, 2], has become a critical task before deploying large language models (LLMs) …
Interpretability at scale: Identifying causal mechanisms in Alpaca
Obtaining human-interpretable explanations of large, general-purpose language models is
an urgent goal for AI safety. However, it is just as important that our interpretability methods …
Speak, memory: An archaeology of books known to ChatGPT/GPT-4
In this work, we carry out a data archaeology to infer books that are known to ChatGPT and
GPT-4 using a name cloze membership inference query. We find that OpenAI models have …
Counterfactual memorization in neural language models
Modern neural language models that are widely used in various NLP tasks risk memorizing
sensitive information from their training data. Understanding this memorization is important …
Embers of autoregression: Understanding large language models through the problem they are trained to solve
The widespread adoption of large language models (LLMs) makes it important to recognize
their strengths and limitations. We argue that in order to develop a holistic understanding of …