A comprehensive overview of large language models

H Naveed, AU Khan, S Qiu, M Saqib, S Anwar… - arXiv preprint arXiv …, 2023 - arxiv.org
Large Language Models (LLMs) have recently demonstrated remarkable capabilities in
natural language processing tasks and beyond. This success of LLMs has led to a large …

A survey on data selection for language models

A Albalak, Y Elazar, SM Xie, S Longpre… - arXiv preprint arXiv …, 2024 - arxiv.org
A major factor in the recent success of large language models is the use of enormous and
ever-growing text datasets for unsupervised pre-training. However, naively training a model …

MetaMath: Bootstrap your own mathematical questions for large language models

L Yu, W Jiang, H Shi, J Yu, Z Liu, Y Zhang… - arXiv preprint arXiv …, 2023 - arxiv.org
Large language models (LLMs) have pushed the limits of natural language understanding
and exhibited excellent problem-solving ability. Despite the great success, most existing …

WizardMath: Empowering mathematical reasoning for large language models via Reinforced Evol-Instruct

H Luo, Q Sun, C Xu, P Zhao, J Lou, C Tao… - arXiv preprint arXiv …, 2023 - arxiv.org
Large language models (LLMs), such as GPT-4, have shown remarkable performance in
natural language processing (NLP) tasks, including challenging mathematical reasoning …

Self-play fine-tuning converts weak language models to strong language models

Z Chen, Y Deng, H Yuan, K Ji, Q Gu - arXiv preprint arXiv:2401.01335, 2024 - arxiv.org
Harnessing the power of human-annotated data through Supervised Fine-Tuning (SFT) is
pivotal for advancing Large Language Models (LLMs). In this paper, we delve into the …
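For intuition, here is a condensed sketch of one self-play (SPIN-style) update under the following assumptions: `logprob(model, x, y)` is a hypothetical helper returning the summed log-probability of response y to prompt x, and the opponent `ref_model` is a frozen copy of the previous iteration, whose own generation plays the role of the rejected response in a DPO-like logistic objective. This is a sketch of the idea, not the paper's exact training code.

```python
import torch.nn.functional as F

# Sketch of a SPIN-style self-play objective. `logprob` is a hypothetical
# helper; `ref_model` is a frozen snapshot of the previous iteration, whose
# own generation `y_self` the learner is trained to disprefer.
def spin_loss(logprob, model, ref_model, x, y_human, y_self, beta=0.1):
    # Log-ratios of the learner vs. the previous iterate, evaluated on
    # human-annotated data and on self-generated data respectively.
    human_margin = logprob(model, x, y_human) - logprob(ref_model, x, y_human)
    self_margin = logprob(model, x, y_self) - logprob(ref_model, x, y_self)
    # Logistic loss pushes the learner to favor human data over its own outputs.
    return -F.logsigmoid(beta * (human_margin - self_margin))
```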

Math-Shepherd: Verify and reinforce LLMs step-by-step without human annotations

P Wang, L Li, Z Shao, R Xu, D Dai, Y Li… - Proceedings of the …, 2024 - aclanthology.org
In this paper, we present Math-Shepherd, an innovative process reward model for math reasoning, which assigns a reward score to each step of a math problem solution. The …
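As a rough illustration of how such per-step scores are used, the sketch below reranks candidate solutions with a process reward model (PRM). `score_step` is a hypothetical stand-in for the trained Math-Shepherd model, and aggregating by the weakest step is one common choice, not necessarily the paper's:

```python
# Sketch of best-of-N verification with a process reward model (PRM).
# `score_step` is a hypothetical stand-in for Math-Shepherd; the paper also
# shows how step labels can be gathered automatically, by sampling
# completions from each step and checking how often they reach the answer.
def score_step(question: str, steps_so_far: list[str]) -> float:
    """Hypothetical PRM call: how likely this partial solution is correct so far."""
    return 0.5  # placeholder

def rerank(question: str, candidates: list[list[str]]) -> list[str]:
    """Keep the candidate whose weakest step scores highest (min aggregation)."""
    return max(
        candidates,
        key=lambda steps: min(
            score_step(question, steps[: i + 1]) for i in range(len(steps))
        ),
    )
```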

ToRA: A tool-integrated reasoning agent for mathematical problem solving

Z Gou, Z Shao, Y Gong, Y Shen, Y Yang… - arXiv preprint arXiv …, 2023 - arxiv.org
Large language models have made significant progress in various language tasks, yet they
still struggle with complex mathematics. In this paper, we propose ToRA, a series of Tool …
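To make the tool-integration pattern concrete, here is a minimal sketch of the generate-execute-observe loop such agents use: the model interleaves natural-language reasoning with program snippets, each snippet is run, and the next step is conditioned on the tool's output. `generate` and the `PYTHON:` marker are hypothetical stand-ins, not ToRA's actual prompt format:

```python
import io
import contextlib

def generate(prompt: str) -> str:
    """Hypothetical LLM call: returns either a final answer or a code snippet."""
    return "PYTHON:\nprint(2 + 2)"  # placeholder

def run_tool(code: str) -> str:
    """Execute a generated snippet and capture stdout as the tool observation."""
    buf = io.StringIO()
    with contextlib.redirect_stdout(buf):
        exec(code, {})
    return buf.getvalue().strip()

def solve(question: str, max_turns: int = 3) -> str:
    prompt = question
    for _ in range(max_turns):
        output = generate(prompt)
        if output.startswith("PYTHON:"):
            code = output.removeprefix("PYTHON:").strip()
            prompt += f"\n{output}\nObservation: {run_tool(code)}"
        else:
            return output  # model produced a final natural-language answer
    return prompt  # turn budget exhausted; return the full trace
```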

MAmmoTH: Building math generalist models through hybrid instruction tuning

X Yue, X Qu, G Zhang, Y Fu, W Huang, H Sun… - arXiv preprint arXiv …, 2023 - arxiv.org
We introduce MAmmoTH, a series of open-source large language models (LLMs)
specifically tailored for general math problem-solving. The MAmmoTH models are trained on …

AlphaZero-like tree-search can guide large language model decoding and training

X Feng, Z Wan, M Wen, SM McAleer, Y Wen… - arXiv preprint arXiv …, 2023 - arxiv.org
Recent works like Tree-of-Thought (ToT) and Reasoning via Planning (RAP) aim to augment
the reasoning capabilities of LLMs by using tree-search algorithms to guide multi-step …
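The guiding idea can be sketched in a few lines: a learned value function steers which partial reasoning path the search expands next. Both `propose_steps` (the LLM policy) and `evaluate` (the value model) below are hypothetical placeholders, and plain best-first expansion stands in for the paper's full MCTS-style search:

```python
import heapq

def propose_steps(state: str, k: int = 3) -> list[str]:
    """Hypothetical policy: sample k candidate next reasoning steps from an LLM."""
    return [f"candidate step {i}" for i in range(k)]  # placeholder

def evaluate(state: str) -> float:
    """Hypothetical value model: score how promising a partial solution is."""
    return -len(state)  # placeholder heuristic

def tree_search(question: str, max_depth: int = 4) -> str:
    """Best-first expansion: always grow the highest-value partial solution."""
    frontier = [(-evaluate(question), 0, question)]  # max-heap via negation
    while frontier:
        _, depth, state = heapq.heappop(frontier)
        if depth == max_depth:
            return state  # most promising reasoning path at the depth limit
        for step in propose_steps(state):
            child = f"{state}\n{step}"
            heapq.heappush(frontier, (-evaluate(child), depth + 1, child))
    return question
```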

InstructERC: Reforming emotion recognition in conversation with a retrieval multi-task LLMs framework

S Lei, G Dong, X Wang, K Wang, S Wang - arXiv preprint arXiv …, 2023 - arxiv.org
The development of emotion recognition in conversation (ERC) has been consistently hindered
by the complexity of pipeline designs, leading to ERC models that often overfit to specific …