Survey of hallucination in natural language generation
Natural Language Generation (NLG) has improved exponentially in recent years thanks to
the development of sequence-to-sequence deep learning technologies such as Transformer …
Towards trustworthy LLMs: a review on debiasing and dehallucinating in large language models
Z Lin, S Guan, W Zhang, H Zhang, Y Li… - Artificial Intelligence …, 2024 - Springer
Recently, large language models (LLMs) have attracted considerable attention due to their
remarkable capabilities. However, LLMs' generation of biased or hallucinatory content …
Siren's song in the AI ocean: a survey on hallucination in large language models
While large language models (LLMs) have demonstrated remarkable capabilities across a
range of downstream tasks, a significant concern revolves around their propensity to exhibit …
SelfCheckGPT: Zero-resource black-box hallucination detection for generative large language models
Generative Large Language Models (LLMs) such as GPT-3 are capable of generating highly
fluent responses to a wide variety of user prompts. However, LLMs are known to hallucinate …
Beyond the imitation game: Quantifying and extrapolating the capabilities of language models
Language models demonstrate both quantitative improvement and new qualitative
capabilities with increasing scale. Despite their potentially transformative impact, these new …
Med-HALT: Medical domain hallucination test for large language models
This research paper focuses on the challenges posed by hallucinations in large language
models (LLMs), particularly in the context of the medical domain. Hallucination, wherein …
A culturally sensitive test to evaluate nuanced GPT hallucination
The Generative Pre-trained Transformer (GPT) models, renowned for generating human-like
text, occasionally produce “hallucinations”: outputs that diverge from human expectations …
Red teaming language model detectors with language models
The prevalence and strong capability of large language models (LLMs) present significant
safety and ethical risks if exploited by malicious users. To prevent the potentially deceptive …
CLIFF: Contrastive learning for improving faithfulness and factuality in abstractive summarization
We study generating abstractive summaries that are faithful and factually consistent with the
given articles. A novel contrastive learning formulation is presented, which leverages both …
Generating benchmarks for factuality evaluation of language models
Before deploying a language model (LM) within a given domain, it is important to measure
its tendency to generate factually incorrect information in that domain. Existing factual …