Large language models for data annotation: A survey
Data annotation generally refers to the labeling or generating of raw data with relevant
information, which could be used for improving the efficacy of machine learning models. The …
Think twice before trusting: Self-detection for large language models through comprehensive answer reflection
Self-detection for Large Language Models (LLMs) seeks to evaluate the
trustworthiness of the LLM's output by leveraging its own capabilities, thereby alleviating the …
Artifacts or Abduction: How Do LLMs Answer Multiple-Choice Questions Without the Question?
Multiple-choice question answering (MCQA) is often used to evaluate large language
models (LLMs). To see if MCQA assesses LLMs as intended, we probe if LLMs can perform …
Is Your Large Language Model Knowledgeable or a Choices-Only Cheater?
N Balepur, R Rudinger - arXiv preprint arXiv:2407.01992, 2024 - arxiv.org
Recent work shows that large language models (LLMs) can answer multiple-choice
questions using only the choices, but does this mean that MCQA leaderboard rankings of …
Plausibly Problematic Questions in Multiple-Choice Benchmarks for Commonsense Reasoning
Questions involving commonsense reasoning about everyday situations often admit many
$\textit{possible}$ or $\textit{plausible}$ answers. In contrast, multiple-choice question …
Reverse Question Answering: Can an LLM Write a Question so Hard (or Bad) that it Can't Answer?
Question answering (QA), producing correct answers for input questions, is popular, but we
test a reverse question answering (RQA) task: given an input answer, generate a question …
Counterfactual Debating with Preset Stances for Hallucination Elimination of LLMs
Large Language Models (LLMs) excel in various natural language processing tasks but
struggle with hallucination issues. Existing solutions have considered utilizing LLMs' …