Beyond human data: Scaling self-training for problem-solving with language models

A Singh, JD Co-Reyes, R Agarwal, A Anand… - arXiv preprint arXiv …, 2023 - arxiv.org
Fine-tuning language models (LMs) on human-generated data remains a prevalent
practice. However, the performance of such models is often limited by the quantity and …

Amortizing intractable inference in large language models

EJ Hu, M Jain, E Elmoznino, Y Kaddar, G Lajoie… - arXiv preprint arXiv …, 2023 - arxiv.org
Autoregressive large language models (LLMs) compress knowledge from their training data
through next-token conditional distributions. This limits tractable querying of this knowledge …

STaR-GATE: Teaching language models to ask clarifying questions

C Andukuri, JP Fränken, T Gerstenberg… - arXiv preprint arXiv …, 2024 - arxiv.org
When prompting language models to complete a task, users often leave important aspects
unsaid. While asking questions could resolve this ambiguity (GATE; Li et al., 2023) …

Quiet-STaR: Language models can teach themselves to think before speaking

E Zelikman, G Harik, Y Shao, V Jayasiri… - arXiv preprint arXiv …, 2024 - arxiv.org
When writing and talking, people sometimes pause to think. Although reasoning-focused
works have often framed reasoning as a method of answering questions or completing …

Doing experiments and revising rules with natural language and probabilistic reasoning

T Piriyakulkij, K Ellis - arXiv preprint arXiv:2402.06025, 2024 - arxiv.org
We build a computational model of how humans actively infer hidden rules by doing
experiments. The basic principle behind the model is that, even if the rule is deterministic …

NExT: Teaching Large Language Models to Reason about Code Execution

A Ni, M Allamanis, A Cohan, Y Deng, K Shi… - arXiv preprint arXiv …, 2024 - arxiv.org
A fundamental skill among human developers is the ability to understand and reason about
program execution. As an example, a programmer can mentally simulate code execution in …

Can a Bayesian Oracle Prevent Harm from an Agent?

Y Bengio, MK Cohen, N Malkin, M MacDermott… - arXiv preprint arXiv …, 2024 - arxiv.org
Is there a way to design powerful AI systems based on machine learning methods that would
satisfy probabilistic safety guarantees? With the long-term goal of obtaining a probabilistic …

Markovian Agents for Truthful Language Modeling

S Viteri, M Lamparth, P Chatain, C Barrett - arXiv preprint arXiv …, 2024 - arxiv.org
Chain-of-Thought (CoT) reasoning could in principle enable a deeper understanding of a
language model's (LM) internal reasoning. However, prior work suggests that some LMs …

Markovian Agents for Faithfulness of Chain-of-Thought Reasoning

S Viteri, M Lamparth, P Chatain, C Barrett - stanfordaialignment.org
Faithful and interpretable reasoning in language models can be achieved by imposing a
bottleneck on the model: generating explanatory notes on how to solve a task, and then …