Large language models for data annotation: A survey

Z Tan, D Li, S Wang, A Beigi, B Jiang… - arXiv preprint arXiv …, 2024 - arxiv.org
Data annotation generally refers to the labeling or generation of raw data with relevant
information, which can be used to improve the efficacy of machine learning models. The …

Think twice before trusting: Self-detection for large language models through comprehensive answer reflection

M Li, W Wang, F Feng, F Zhu, Q Wang… - Findings of the …, 2024 - aclanthology.org
Self-detection for Large Language Models (LLMs) seeks to evaluate the trustworthiness of
the LLM's output by leveraging its own capabilities, thereby alleviating the …

Artifacts or Abduction: How Do LLMs Answer Multiple-Choice Questions Without the Question?

N Balepur, A Ravichander, R Rudinger - arXiv preprint arXiv:2402.12483, 2024 - arxiv.org
Multiple-choice question answering (MCQA) is often used to evaluate large language
models (LLMs). To see if MCQA assesses LLMs as intended, we probe if LLMs can perform …

Is Your Large Language Model Knowledgeable or a Choices-Only Cheater?

N Balepur, R Rudinger - arXiv preprint arXiv:2407.01992, 2024 - arxiv.org
Recent work shows that large language models (LLMs) can answer multiple-choice
questions using only the choices, but does this mean that MCQA leaderboard rankings of …

Plausibly Problematic Questions in Multiple-Choice Benchmarks for Commonsense Reasoning

S Palta, N Balepur, P Rankel, S Wiegreffe… - arXiv preprint arXiv …, 2024 - arxiv.org
Questions involving commonsense reasoning about everyday situations often admit many
possible or plausible answers. In contrast, multiple-choice question …

Reverse Question Answering: Can an LLM Write a Question so Hard (or Bad) that it Can't Answer?

N Balepur, F Gu, A Ravichander, S Feng… - arXiv preprint arXiv …, 2024 - arxiv.org
Question answering (QA), producing correct answers for input questions, is popular, but we
test a reverse question answering (RQA) task: given an input answer, generate a question …

Counterfactual Debating with Preset Stances for Hallucination Elimination of LLMs

Y Fang, M Li, W Wang, H Lin, F Feng - arXiv preprint arXiv:2406.11514, 2024 - arxiv.org
Large Language Models (LLMs) excel in various natural language processing tasks but
struggle with hallucination issues. Existing solutions have considered utilizing LLMs' …