When Can LLMs Actually Correct Their Own Mistakes? A Critical Survey of Self-Correction of LLMs

R Kamoi, Y Zhang, N Zhang, J Han… - Transactions of the …, 2024 - direct.mit.edu
Self-correction is an approach to improving responses from large language models (LLMs)
by refining the responses using LLMs during inference. Prior work has proposed various self …

Natural language generation and understanding of big code for AI-assisted programming: A review

MF Wong, S Guo, CN Hang, SW Ho, CW Tan - Entropy, 2023 - mdpi.com
This paper provides a comprehensive review of the literature concerning the utilization of
Natural Language Processing (NLP) techniques, with a particular focus on transformer …

Self-refine: Iterative refinement with self-feedback

A Madaan, N Tandon, P Gupta… - Advances in …, 2024 - proceedings.neurips.cc
Like humans, large language models (LLMs) do not always generate the best output on their
first try. Motivated by how humans refine their written text, we introduce Self-Refine, an …
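The Self-Refine entry above describes an iterative generate, feedback, refine loop in which the same model critiques and revises its own output. A minimal sketch of that loop, assuming a generic `llm` callable as a hypothetical stand-in for any text-completion API (the prompts and stop criterion here are illustrative, not taken from the paper):

```python
def self_refine(llm, task, max_iters=3, stop_phrase="no further issues"):
    """Iteratively refine an LLM draft using the model's own feedback.

    `llm` is any callable mapping a prompt string to a completion string;
    the prompt wording and stop phrase are illustrative assumptions.
    """
    # Initial attempt.
    draft = llm(f"Solve the task:\n{task}")
    for _ in range(max_iters):
        # Ask the same model to critique its own draft.
        feedback = llm(
            f"Task:\n{task}\nDraft:\n{draft}\nGive concrete feedback."
        )
        if stop_phrase in feedback.lower():
            break  # the model judges its own draft acceptable
        # Revise the draft in light of the feedback.
        draft = llm(
            f"Task:\n{task}\nDraft:\n{draft}\n"
            f"Feedback:\n{feedback}\nRevise the draft."
        )
    return draft
```

The loop terminates either after a fixed budget of iterations or when the feedback step signals that no further revision is needed, matching the paper's high-level description of refinement with self-generated feedback.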

Reflexion: Language agents with verbal reinforcement learning

N Shinn, F Cassano, A Gopinath… - Advances in …, 2024 - proceedings.neurips.cc
Large language models (LLMs) have been increasingly used to interact with external
environments (e.g., games, compilers, APIs) as goal-driven agents. However, it remains …

CodeT5+: Open code large language models for code understanding and generation

Y Wang, H Le, AD Gotmare, NDQ Bui, J Li… - arXiv preprint arXiv …, 2023 - arxiv.org
Large language models (LLMs) pretrained on vast source code have achieved prominent
progress in code intelligence. However, existing code LLMs have two main limitations in …

CRITIC: Large language models can self-correct with tool-interactive critiquing

Z Gou, Z Shao, Y Gong, Y Shen, Y Yang… - arXiv preprint arXiv …, 2023 - arxiv.org
Recent developments in large language models (LLMs) have been impressive. However,
these models sometimes show inconsistencies and problematic behavior, such as …

Trustworthy LLMs: A survey and guideline for evaluating large language models' alignment

Y Liu, Y Yao, JF Ton, X Zhang, R Guo, H Cheng… - arXiv preprint arXiv …, 2023 - arxiv.org
Ensuring alignment, which refers to making models behave in accordance with human
intentions [1, 2], has become a critical task before deploying large language models (LLMs) …

CodeT: Code generation with generated tests

B Chen, F Zhang, A Nguyen, D Zan, Z Lin… - arXiv preprint arXiv …, 2022 - arxiv.org
The task of generating code solutions for a given programming problem can benefit from the
use of pre-trained language models such as Codex, which can produce multiple diverse …

Cognitive architectures for language agents

TR Sumers, S Yao, K Narasimhan… - arXiv preprint arXiv …, 2023 - arxiv.org
Recent efforts have incorporated large language models (LLMs) with external resources (e.g.,
the Internet) or internal control flows (e.g., prompt chaining) for tasks requiring grounding or …

Large language models meet NL2Code: A survey

D Zan, B Chen, F Zhang, D Lu, B Wu, B Guan… - arXiv preprint arXiv …, 2022 - arxiv.org
The task of generating code from a natural language description, or NL2Code, is considered
a pressing and significant challenge in code intelligence. Thanks to the rapid development …