A comprehensive overview of large language models

H Naveed, AU Khan, S Qiu, M Saqib, S Anwar… - arXiv preprint arXiv …, 2023 - arxiv.org
Large Language Models (LLMs) have recently demonstrated remarkable capabilities in
natural language processing tasks and beyond. This success of LLMs has led to a large …

A survey on data selection for language models

A Albalak, Y Elazar, SM Xie, S Longpre… - arXiv preprint arXiv …, 2024 - arxiv.org
A major factor in the recent success of large language models is the use of enormous and
ever-growing text datasets for unsupervised pre-training. However, naively training a model …

MetaMath: Bootstrap your own mathematical questions for large language models

L Yu, W Jiang, H Shi, J Yu, Z Liu, Y Zhang… - arXiv preprint arXiv …, 2023 - arxiv.org
Large language models (LLMs) have pushed the limits of natural language understanding
and exhibited excellent problem-solving ability. Despite the great success, most existing …

WizardMath: Empowering mathematical reasoning for large language models via Reinforced Evol-Instruct

H Luo, Q Sun, C Xu, P Zhao, J Lou, C Tao… - arXiv preprint arXiv …, 2023 - arxiv.org
Large language models (LLMs), such as GPT-4, have shown remarkable performance in
natural language processing (NLP) tasks, including challenging mathematical reasoning …

Self-play fine-tuning converts weak language models to strong language models

Z Chen, Y Deng, H Yuan, K Ji, Q Gu - arXiv preprint arXiv:2401.01335, 2024 - arxiv.org
Harnessing the power of human-annotated data through Supervised Fine-Tuning (SFT) is
pivotal for advancing Large Language Models (LLMs). In this paper, we delve into the …
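For intuition, here is a condensed sketch of one self-play (SPIN-style) update under the following assumptions: `logprob(model, x, y)` is a hypothetical helper returning the summed log-probability of response y to prompt x, and the opponent `ref_model` is a frozen copy of the previous iteration, whose own generation plays the role of the rejected response in a DPO-like logistic objective. This is a sketch of the idea, not the paper's exact training code.

```python
import torch.nn.functional as F

# Sketch of a SPIN-style self-play objective. `logprob` is a hypothetical
# helper; `ref_model` is a frozen snapshot of the previous iteration, whose
# own generation `y_self` the learner is trained to disprefer.
def spin_loss(logprob, model, ref_model, x, y_human, y_self, beta=0.1):
    # Log-ratios of the learner vs. the previous iterate, evaluated on
    # human-annotated data and on self-generated data respectively.
    human_margin = logprob(model, x, y_human) - logprob(ref_model, x, y_human)
    self_margin = logprob(model, x, y_self) - logprob(ref_model, x, y_self)
    # Logistic loss pushes the learner to favor human data over its own outputs.
    return -F.logsigmoid(beta * (human_margin - self_margin))
```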

Math-Shepherd: Verify and reinforce LLMs step-by-step without human annotations

P Wang, L Li, Z Shao, R Xu, D Dai, Y Li… - Proceedings of the …, 2024 - aclanthology.org
In this paper, we present Math-Shepherd, an innovative process reward model for math reasoning, which assigns a reward score to each step of a math problem solution. The …
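As a rough illustration of how such per-step scores are used, the sketch below reranks candidate solutions with a process reward model (PRM). `score_step` is a hypothetical stand-in for the trained Math-Shepherd model, and aggregating by the weakest step is one common choice, not necessarily the paper's:

```python
# Sketch of best-of-N verification with a process reward model (PRM).
# `score_step` is a hypothetical stand-in for Math-Shepherd; the paper also
# shows how step labels can be gathered automatically, by sampling
# completions from each step and checking how often they reach the answer.
def score_step(question: str, steps_so_far: list[str]) -> float:
    """Hypothetical PRM call: how likely this partial solution is correct so far."""
    return 0.5  # placeholder

def rerank(question: str, candidates: list[list[str]]) -> list[str]:
    """Keep the candidate whose weakest step scores highest (min aggregation)."""
    return max(
        candidates,
        key=lambda steps: min(
            score_step(question, steps[: i + 1]) for i in range(len(steps))
        ),
    )
```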

ToRA: A tool-integrated reasoning agent for mathematical problem solving

Z Gou, Z Shao, Y Gong, Y Shen, Y Yang… - arXiv preprint arXiv …, 2023 - arxiv.org
Large language models have made significant progress in various language tasks, yet they
still struggle with complex mathematics. In this paper, we propose ToRA, a series of Tool …
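To make the tool-integration pattern concrete, here is a minimal sketch of the generate-execute-observe loop such agents use: the model interleaves natural-language reasoning with program snippets, each snippet is run, and the next step is conditioned on the tool's output. `generate` and the `PYTHON:` marker are hypothetical stand-ins, not ToRA's actual prompt format:

```python
import io
import contextlib

def generate(prompt: str) -> str:
    """Hypothetical LLM call: returns either a final answer or a code snippet."""
    return "PYTHON:\nprint(2 + 2)"  # placeholder

def run_tool(code: str) -> str:
    """Execute a generated snippet and capture stdout as the tool observation."""
    buf = io.StringIO()
    with contextlib.redirect_stdout(buf):
        exec(code, {})
    return buf.getvalue().strip()

def solve(question: str, max_turns: int = 3) -> str:
    prompt = question
    for _ in range(max_turns):
        output = generate(prompt)
        if output.startswith("PYTHON:"):
            code = output.removeprefix("PYTHON:").strip()
            prompt += f"\n{output}\nObservation: {run_tool(code)}"
        else:
            return output  # model produced a final natural-language answer
    return prompt  # turn budget exhausted; return the full trace
```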

MAmmoTH: Building math generalist models through hybrid instruction tuning

X Yue, X Qu, G Zhang, Y Fu, W Huang, H Sun… - arXiv preprint arXiv …, 2023 - arxiv.org
We introduce MAmmoTH, a series of open-source large language models (LLMs)
specifically tailored for general math problem-solving. The MAmmoTH models are trained on …

AlphaZero-like tree-search can guide large language model decoding and training

X Feng, Z Wan, M Wen, SM McAleer, Y Wen… - arXiv preprint arXiv …, 2023 - arxiv.org
Recent works like Tree-of-Thought (ToT) and Reasoning via Planning (RAP) aim to augment
the reasoning capabilities of LLMs by using tree-search algorithms to guide multi-step …
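The guiding idea can be sketched in a few lines: a learned value function steers which partial reasoning path the search expands next. Both `propose_steps` (the LLM policy) and `evaluate` (the value model) below are hypothetical placeholders, and plain best-first expansion stands in for the paper's full MCTS-style search:

```python
import heapq

def propose_steps(state: str, k: int = 3) -> list[str]:
    """Hypothetical policy: sample k candidate next reasoning steps from an LLM."""
    return [f"candidate step {i}" for i in range(k)]  # placeholder

def evaluate(state: str) -> float:
    """Hypothetical value model: score how promising a partial solution is."""
    return -len(state)  # placeholder heuristic

def tree_search(question: str, max_depth: int = 4) -> str:
    """Best-first expansion: always grow the highest-value partial solution."""
    frontier = [(-evaluate(question), 0, question)]  # max-heap via negation
    while frontier:
        _, depth, state = heapq.heappop(frontier)
        if depth == max_depth:
            return state  # most promising reasoning path at the depth limit
        for step in propose_steps(state):
            child = f"{state}\n{step}"
            heapq.heappush(frontier, (-evaluate(child), depth + 1, child))
    return question
```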

InstructERC: Reforming emotion recognition in conversation with a retrieval multi-task LLMs framework

S Lei, G Dong, X Wang, K Wang, S Wang - arXiv preprint arXiv …, 2023 - arxiv.org
The development of emotion recognition in conversation (ERC) has been consistently hindered
by the complexity of pipeline designs, leading to ERC models that often overfit to specific …