Large language models for mathematical reasoning: Progresses and challenges

J Ahn, R Verma, R Lou, D Liu, R Zhang… - arXiv preprint arXiv …, 2024 - arxiv.org
Mathematical reasoning serves as a cornerstone for assessing the fundamental cognitive
capabilities of human intelligence. In recent times, there has been a notable surge in the …

Let GPT be a math tutor: Teaching math word problem solvers with customized exercise generation

Z Liang, W Yu, T Rajpurohit, P Clark, X Zhang… - arXiv preprint arXiv …, 2023 - arxiv.org
In this paper, we present a novel approach for distilling math word problem solving
capabilities from large language models (LLMs) into smaller, more efficient student models …

UniMath: A foundational and multimodal mathematical reasoner

Z Liang, T Yang, J Zhang, X Zhang - Proceedings of the 2023 …, 2023 - aclanthology.org
While significant progress has been made in natural language processing (NLP), existing
methods exhibit limitations in effectively interpreting and processing diverse mathematical …

Auto-Instruct: Automatic instruction generation and ranking for black-box language models

Z Zhang, S Wang, W Yu, Y Xu, D Iter, Q Zeng… - arXiv preprint arXiv …, 2023 - arxiv.org
Large language models (LLMs) can perform a wide range of tasks by following natural
language instructions, without requiring task-specific fine-tuning. Unfortunately, the …

How numerical precision affects mathematical reasoning capabilities of LLMs

G Feng, K Yang, Y Gu, X Ai, S Luo, J Sun, D He… - arXiv preprint arXiv …, 2024 - arxiv.org
Despite the remarkable success of Transformer-based Large Language Models (LLMs)
across various domains, understanding and enhancing their mathematical capabilities …

Teaching-Assistant-in-the-Loop: Improving Knowledge Distillation from Imperfect Teacher Models in Low-Budget Scenarios

Y Zhou, W Ai - arXiv preprint arXiv:2406.05322, 2024 - arxiv.org
There is increasing interest in distilling task-specific knowledge from large language models
(LLMs) into smaller student models. Nonetheless, LLM distillation presents a dual challenge: 1) …

MathChat: Benchmarking Mathematical Reasoning and Instruction Following in Multi-Turn Interactions

Z Liang, D Yu, W Yu, W Yao, Z Zhang, X Zhang… - arXiv preprint arXiv …, 2024 - arxiv.org
Large language models (LLMs) have demonstrated impressive capabilities in mathematical
problem solving, particularly in single-turn question answering formats. However, real-world …

SIaM: Self-improving code-assisted mathematical reasoning of large language models

D Yu, B Peng, Y Tian, L Song, H Mi, D Yu - arXiv preprint arXiv …, 2024 - arxiv.org
There is a growing trend of teaching large language models (LLMs) to solve mathematical
problems through coding. Existing studies primarily focus on prompting powerful, closed …

A Survey of Mathematical Reasoning in the Era of Multimodal Large Language Model: Benchmark, Method & Challenges

Y Yan, J Su, J He, F Fu, X Zheng, Y Lyu… - arXiv preprint arXiv …, 2024 - arxiv.org
Mathematical reasoning, a core aspect of human cognition, is vital across many domains,
from educational problem-solving to scientific advancements. As artificial general …

Towards A Unified View of Answer Calibration for Multi-Step Reasoning

S Deng, N Zhang, N Oo, B Hooi - arXiv preprint arXiv:2311.09101, 2023 - arxiv.org
Large Language Models (LLMs) employing Chain-of-Thought (CoT) prompting have
broadened the scope for improving multi-step reasoning capabilities. Usually, answer …