Natural language reasoning, a survey

F Yu, H Zhang, P Tiwari, B Wang - ACM Computing Surveys, 2023 - dl.acm.org
This survey paper proposes a clearer view of natural language reasoning in the field of
Natural Language Processing (NLP), both conceptually and practically. Conceptually, we …

Igniting Language Intelligence: The Hitchhiker's Guide From Chain-of-Thought Reasoning to Language Agents

Z Zhang, Y Yao, A Zhang, X Tang, X Ma, Z He… - arXiv preprint arXiv …, 2023 - arxiv.org
Large language models (LLMs) have dramatically enhanced the field of language
intelligence, as demonstrably evidenced by their formidable empirical performance across a …

A survey of large language models

WX Zhao, K Zhou, J Li, T Tang, X Wang, Y Hou… - arXiv preprint arXiv …, 2023 - arxiv.org
Language is essentially a complex, intricate system of human expressions governed by
grammatical rules. It poses a significant challenge to develop capable AI algorithms for …

CRITIC: Large language models can self-correct with tool-interactive critiquing

Z Gou, Z Shao, Y Gong, Y Shen, Y Yang… - arXiv preprint arXiv …, 2023 - arxiv.org
Recent developments in large language models (LLMs) have been impressive. However,
these models sometimes show inconsistencies and problematic behavior, such as …

Math-Shepherd: Verify and reinforce LLMs step-by-step without human annotations

P Wang, L Li, Z Shao, R Xu, D Dai, Y Li… - Proceedings of the …, 2024 - aclanthology.org
In this paper, we present an innovative process-oriented math reward model called
Math-Shepherd, which assigns a reward score to each step of math problem solutions. The …

Language agent tree search unifies reasoning acting and planning in language models

A Zhou, K Yan, M Shlapentokh-Rothman… - arXiv preprint arXiv …, 2023 - arxiv.org
While large language models (LLMs) have demonstrated impressive performance on a
range of decision-making tasks, they rely on simple acting processes and fall short of broad …

Branch-solve-merge improves large language model evaluation and generation

S Saha, O Levy, A Celikyilmaz, M Bansal… - arXiv preprint arXiv …, 2023 - arxiv.org
Large Language Models (LLMs) are frequently used for multi-faceted language generation
and evaluation tasks that involve satisfying intricate user constraints or taking into account …

Holistic analysis of hallucination in GPT-4V(ision): Bias and interference challenges

C Cui, Y Zhou, X Yang, S Wu, L Zhang, J Zou… - arXiv preprint arXiv …, 2023 - arxiv.org
While GPT-4V(ision) impressively models both visual and textual information
simultaneously, its hallucination behavior has not been systematically assessed. To bridge …

ADaPT: As-needed decomposition and planning with language models

A Prasad, A Koller, M Hartmann, P Clark… - arXiv preprint arXiv …, 2023 - arxiv.org
Large Language Models (LLMs) are increasingly being used for interactive decision-making
tasks requiring planning and adapting to the environment. Recent works employ LLMs-as …

AlphaZero-like tree-search can guide large language model decoding and training

X Feng, Z Wan, M Wen, SM McAleer, Y Wen… - arXiv preprint arXiv …, 2023 - arxiv.org
Recent works like Tree-of-Thought (ToT) and Reasoning via Planning (RAP) aim to augment
the reasoning capabilities of LLMs by using tree-search algorithms to guide multi-step …