Factscore: Fine-grained atomic evaluation of factual precision in long form text generation
Evaluating the factuality of long-form text generated by large language models (LMs) is non-
trivial because (1) generations often contain a mixture of supported and unsupported pieces …
A survey on automated fact-checking
Fact-checking has become increasingly important due to the speed with which both
information and misinformation can spread in the modern media ecosystem. Therefore …
Self-critiquing models for assisting human evaluators
We fine-tune large language models to write natural language critiques (natural language
critical comments) using behavioral cloning. On a topic-based summarization task, critiques …
Rarr: Researching and revising what language models say, using language models
Language models (LMs) now excel at many tasks such as few-shot learning, question
answering, reasoning, and dialog. However, they sometimes generate unsupported or …
Internet-augmented language models through few-shot prompting for open-domain question answering
In this work, we aim to capitalize on the unique few-shot capabilities of large-scale language
models (LSLMs) to overcome some of their challenges with respect to grounding to factual …
Recursively summarizing books with human feedback
A major challenge for scaling machine learning is training models to perform tasks that are
very difficult or time-consuming for humans to evaluate. We present progress on this …
Automated fact-checking for assisting human fact-checkers
The reporting and the analysis of current events around the globe has expanded from
professional, editor-led journalism all the way to citizen journalism. Nowadays, politicians …
LongEval: Guidelines for human evaluation of faithfulness in long-form summarization
While human evaluation remains best practice for accurately judging the faithfulness of
automatically-generated summaries, few solutions exist to address the increased difficulty …
The state of human-centered NLP technology for fact-checking
Misinformation threatens modern society by promoting distrust in science, changing
narratives in public health, heightening social polarization, and disrupting democratic …
Evidence-based fact-checking of health-related claims
The task of verifying the truthfulness of claims in textual documents, or fact-checking, has
received significant attention in recent years. Many existing evidence-based fact-checking …