HaluEval: A large-scale hallucination evaluation benchmark for large language models

J Li, X Cheng, WX Zhao, JY Nie, JR Wen - arXiv preprint arXiv:2305.11747, 2023 - arxiv.org
Large language models (LLMs), such as ChatGPT, are prone to generate hallucinations, i.e.,
content that conflicts with the source or cannot be verified by the factual knowledge. To …

Mitigating large language model hallucinations via autonomous knowledge graph-based retrofitting

X Guan, Y Liu, H Lin, Y Lu, B He, X Han… - Proceedings of the AAAI …, 2024 - ojs.aaai.org
Incorporating factual knowledge in knowledge graph is regarded as a promising approach
for mitigating the hallucination of large language models (LLMs). Existing methods usually …

The dawn after the dark: An empirical study on factuality hallucination in large language models

J Li, J Chen, R Ren, X Cheng, WX Zhao, JY Nie… - arXiv preprint arXiv …, 2024 - arxiv.org
In the era of large language models (LLMs), hallucination (i.e., the tendency to generate
factually incorrect content) poses great challenge to trustworthy and reliable deployment of …

DRAGIN: Dynamic retrieval augmented generation based on the real-time information needs of large language models

W Su, Y Tang, Q Ai, Z Wu, Y Liu - arXiv preprint arXiv:2403.10081, 2024 - arxiv.org
Dynamic retrieval augmented generation (RAG) paradigm actively decides when and what
to retrieve during the text generation process of Large Language Models (LLMs). There are …

Human vs ChatGPT: Effect of Data Annotation in Interpretable Crisis-Related Microblog Classification

TH Nguyen, K Rudra - Proceedings of the ACM on Web Conference …, 2024 - dl.acm.org
Recent studies have exploited the vital role of microblogging platforms, such as Twitter, in
crisis situations. Various machine-learning approaches have been proposed to identify and …

Mitigating Entity-Level Hallucination in Large Language Models

W Su, Y Tang, Q Ai, C Wang, Z Wu, Y Liu - arXiv preprint arXiv:2407.09417, 2024 - arxiv.org
The emergence of Large Language Models (LLMs) has revolutionized how users access
information, shifting from traditional search engines to direct question-and-answer …

REPOFORMER: Selective retrieval for repository-level code completion

D Wu, WU Ahmad, D Zhang, MK Ramanathan… - arXiv preprint arXiv …, 2024 - arxiv.org
Recent advances in retrieval-augmented generation (RAG) have initiated a new era in
repository-level code completion. However, the invariable use of retrieval in existing …

Open-World Evaluation for Retrieving Diverse Perspectives

HT Chen, E Choi - arXiv preprint arXiv:2409.18110, 2024 - arxiv.org
We study retrieving a set of documents that covers various perspectives on a complex and
contentious question (e.g., will ChatGPT do more harm than good?). We curate a Benchmark …

Ladder-of-thought: Using knowledge as steps to elevate stance detection

K Hu, M Yan, WH Chong, YK Yap… - … Joint Conference on …, 2024 - ieeexplore.ieee.org
Stance detection aims to determine the attitude or viewpoint expressed in a document
regarding a specific target. Recent advancements in Large Language Models (LLMs), such …

Preference-Guided Refactored Tuning for Retrieval Augmented Code Generation

X Gao, Y Xiong, D Wang, Z Guan, Z Shi… - arXiv preprint arXiv …, 2024 - arxiv.org
Retrieval-augmented code generation utilizes Large Language Models as the generator
and significantly expands their code generation capabilities by providing relevant code …