Survey of hallucination in natural language generation

Z Ji, N Lee, R Frieske, T Yu, D Su, Y Xu, E Ishii… - ACM Computing …, 2023 - dl.acm.org
Natural Language Generation (NLG) has improved exponentially in recent years thanks to
the development of sequence-to-sequence deep learning technologies such as Transformer …

Towards trustworthy LLMs: a review on debiasing and dehallucinating in large language models

Z Lin, S Guan, W Zhang, H Zhang, Y Li… - Artificial Intelligence …, 2024 - Springer
Recently, large language models (LLMs) have attracted considerable attention due to their
remarkable capabilities. However, LLMs' generation of biased or hallucinatory content …

A survey on hallucination in large language models: Principles, taxonomy, challenges, and open questions

L Huang, W Yu, W Ma, W Zhong, Z Feng… - ACM Transactions on …, 2023 - dl.acm.org
The emergence of large language models (LLMs) has marked a significant breakthrough in
natural language processing (NLP), fueling a paradigm shift in information acquisition …

Biases in large language models: origins, inventory, and discussion

R Navigli, S Conia, B Ross - ACM Journal of Data and Information …, 2023 - dl.acm.org
In this article, we introduce and discuss the pervasive issue of bias in the large language
models that are currently at the core of mainstream approaches to Natural Language …

ChatGPT as a factual inconsistency evaluator for text summarization

Z Luo, Q Xie, S Ananiadou - arXiv preprint arXiv:2303.15621, 2023 - arxiv.org
The performance of text summarization has been greatly boosted by pre-trained language
models. A main concern of existing methods is that most generated summaries are not …

Hallucination is inevitable: An innate limitation of large language models

Z Xu, S Jain, M Kankanhalli - arXiv preprint arXiv:2401.11817, 2024 - arxiv.org
Hallucination has been widely recognized to be a significant drawback for large language
models (LLMs). There have been many works that attempt to reduce the extent of …

AlignScore: Evaluating factual consistency with a unified alignment function

Y Zha, Y Yang, R Li, Z Hu - arXiv preprint arXiv:2305.16739, 2023 - arxiv.org
Many text generation applications require the generated text to be factually consistent with
input information. Automatic evaluation of factual consistency is challenging. Previous work …

LogiQA 2.0 — an improved dataset for logical reasoning in natural language understanding

H Liu, J Liu, L Cui, Z Teng, N Duan… - … on Audio, Speech …, 2023 - ieeexplore.ieee.org
NLP research on logical reasoning regains momentum with the recent releases of a handful
of datasets, notably LogiQA and Reclor. Logical reasoning is exploited in many probing …

TrueTeacher: Learning factual consistency evaluation with large language models

Z Gekhman, J Herzig, R Aharoni, C Elkind… - arXiv preprint arXiv …, 2023 - arxiv.org
Factual consistency evaluation is often conducted using Natural Language Inference (NLI)
models, yet these models exhibit limited success in evaluating summaries. Previous work …

Faithfulness in natural language generation: A systematic survey of analysis, evaluation and optimization methods

W Li, W Wu, M Chen, J Liu, X Xiao, H Wu - arXiv preprint arXiv:2203.05227, 2022 - arxiv.org
Natural Language Generation (NLG) has made great progress in recent years due to the
development of deep learning techniques such as pre-trained language models. This …