AI deception: A survey of examples, risks, and potential solutions
This paper argues that a range of current AI systems have learned how to deceive humans.
We define deception as the systematic inducement of false beliefs in the pursuit of some …
Siren's song in the AI ocean: a survey on hallucination in large language models
While large language models (LLMs) have demonstrated remarkable capabilities across a
range of downstream tasks, a significant concern revolves around their propensity to exhibit …
Explainability for large language models: A survey
Large language models (LLMs) have demonstrated impressive capabilities in natural
language processing. However, their internal mechanisms are still unclear and this lack of …
XSTest: A test suite for identifying exaggerated safety behaviours in large language models
Without proper safeguards, large language models will readily follow malicious instructions
and generate toxic content. This risk motivates safety efforts such as red-teaming and large …
How to catch an AI liar: Lie detection in black-box LLMs by asking unrelated questions
Large language models (LLMs) can "lie", which we define as outputting false statements
despite "knowing" the truth in a demonstrable sense. LLMs might "lie", for example, when …
Algorithm of thoughts: Enhancing exploration of ideas in large language models
Current literature, aiming to surpass the "Chain-of-Thought" approach, often resorts to
external modi operandi involving halting, modifying, and then resuming the generation …
Towards LogiGLUE: A brief survey and a benchmark for analyzing logical reasoning capabilities of language models
Logical reasoning is fundamental for humans yet presents a substantial challenge in the
domain of Artificial Intelligence. Initially, researchers used Knowledge Representation and …
Automatically Correcting Large Language Models: Surveying the Landscape of Diverse Automated Correction Strategies
While large language models (LLMs) have shown remarkable effectiveness in various NLP
tasks, they are still prone to issues such as hallucination, unfaithful reasoning, and toxicity. A …
Measuring and improving chain-of-thought reasoning in vision-language models
Vision-language models (VLMs) have recently demonstrated strong efficacy as visual
assistants that can parse natural queries about the visual content and generate human-like …
Evaluating language model agency through negotiations
Companies, organizations, and governments increasingly exploit Language Models' (LM)
remarkable capability to display agent-like behavior. As LMs are adopted to perform tasks …