Internal consistency and self-feedback in large language models: A survey

X Liang, S Song, Z Zheng, H Wang, Q Yu, X Li… - arXiv preprint arXiv …, 2024 - arxiv.org
Large language models (LLMs) are expected to respond accurately but often exhibit
deficient reasoning or generate hallucinatory content. To address these, studies prefixed …

LiteSearch: Efficacious Tree Search for LLM

A Wang, L Song, Y Tian, B Peng, D Yu, H Mi… - arXiv preprint arXiv …, 2024 - arxiv.org
Recent research suggests that tree search algorithms (e.g., Monte Carlo Tree Search) can
dramatically boost LLM performance on complex mathematical reasoning tasks. However …

Towards Self-Improvement of LLMs via MCTS: Leveraging Stepwise Knowledge with Curriculum Preference Learning

X Wang, L Song, Y Tian, D Yu, B Peng, H Mi… - arXiv preprint arXiv …, 2024 - arxiv.org
Monte Carlo Tree Search (MCTS) has recently emerged as a powerful technique for
enhancing the reasoning capabilities of LLMs. Techniques such as SFT or DPO have …

Optimizing Task Planning Efficiency in LLMs: Beyond Closed-Loop Systems

L Liu, A Nair, T Peng, S Desai, M Gupta… - Authorea …, 2024 - researchgate.net
Large language models (LLMs) have shown great promise in task execution, but traditional
closed-loop systems limit their planning efficiency. Addressing this challenge, we introduce …