Understanding the reasoning ability of language models from the perspective of reasoning paths aggregation

X Wang, A Amayuelas, K Zhang, L Pan, W Chen… - arXiv preprint arXiv …, 2024 - arxiv.org
Pre-trained language models (LMs) are able to perform complex reasoning without explicit
fine-tuning. To understand how pre-training with a next-token prediction objective …

On provable length and compositional generalization

K Ahuja, A Mansouri - arXiv preprint arXiv:2402.04875, 2024 - arxiv.org
Length generalization--the ability to generalize to longer sequences than those seen during
training--and compositional generalization--the ability to generalize to token combinations …

Position Coupling: Leveraging Task Structure for Improved Length Generalization of Transformers

H Cho, J Cha, P Awasthi, S Bhojanapalli… - arXiv preprint arXiv …, 2024 - arxiv.org
Even for simple arithmetic tasks like integer addition, it is challenging for Transformers to
generalize to longer sequences than those encountered during training. To tackle this …

Your Context Is Not an Array: Unveiling Random Access Limitations in Transformers

MR Ebrahimi, S Panchal, R Memisevic - arXiv preprint arXiv:2408.05506, 2024 - arxiv.org
Despite their recent successes, Transformer-based large language models show surprising
failure modes. A well-known example of such failure modes is their inability to length …