Improving large language model fine-tuning for solving math problems

From Google Gemini to OpenAI Q* (Q-Star): A survey of reshaping the generative artificial intelligence (AI) research landscape

TR McIntosh, T Susnjak, T Liu, P Watters… - arXiv preprint arXiv …, 2023 - arxiv.org

This comprehensive survey explored the evolving landscape of generative Artificial
Intelligence (AI), with a specific focus on the transformative impacts of Mixture of Experts …

Cited by 118 - Related articles - All 3 versions

Mathematical language models: A survey

W Liu, H Hu, J Zhou, Y Ding, J Li, J Zeng, M He… - arXiv preprint arXiv …, 2023 - arxiv.org

In recent years, there has been remarkable progress in leveraging Language Models (LMs),
encompassing Pre-trained Language Models (PLMs) and Large-scale Language Models …

Cited by 11 - Related articles - All 2 versions

Generative verifiers: Reward modeling as next-token prediction

L Zhang, A Hosseini, H Bansal, M Kazemi… - arXiv preprint arXiv …, 2024 - arxiv.org

Verifiers or reward models are often used to enhance the reasoning performance of large
language models (LLMs). A common approach is the Best-of-N method, where N candidate …

Cited by 35 - Related articles - All 4 versions

Safe-CLIP: Removing NSFW concepts from vision-and-language models

S Poppi, T Poppi, F Cocchi, M Cornia, L Baraldi… - … on Computer Vision, 2025 - Springer

Large-scale vision-and-language models, such as CLIP, are typically trained on web-scale
data, which can introduce inappropriate content and lead to the development of unsafe and …

Cited by 13 - Related articles

Large language models for mathematical reasoning: Progresses and challenges

J Ahn, R Verma, R Lou, D Liu, R Zhang… - arXiv preprint arXiv …, 2024 - arxiv.org

Mathematical reasoning serves as a cornerstone for assessing the fundamental cognitive
capabilities of human intelligence. In recent times, there has been a notable surge in the …

Cited by 112 - Related articles - All 4 versions

Step-DPO: Step-wise preference optimization for long-chain reasoning of LLMs

X Lai, Z Tian, Y Chen, S Yang, X Peng, J Jia - arXiv preprint arXiv …, 2024 - arxiv.org

Mathematical reasoning presents a significant challenge for Large Language Models
(LLMs) due to the extensive and precise chain of reasoning required for accuracy. Ensuring …

Cited by 23 - Related articles - All 3 versions

Predicting text preference via structured comparative reasoning

JN Yan, T Liu, J Chiu, J Shen, Z Qin, Y Yu… - Proceedings of the …, 2024 - aclanthology.org

Comparative reasoning plays a crucial role in predicting text preferences; however, large
language models (LLMs) often demonstrate inconsistencies in their reasoning, leading to …

Cited by 6 - Related articles - All 2 versions

Visual agents as fast and slow thinkers

G Sun, M Jin, Z Wang, CL Wang, S Ma, Q Wang… - arXiv preprint arXiv …, 2024 - arxiv.org

Achieving human-level intelligence requires refining cognitive distinctions between System
1 and System 2 thinking. While contemporary AI, driven by large language models …

Cited by 6 - Related articles - All 3 versions

Fight back against jailbreaking via prompt adversarial tuning

Y Mo, Y Wang, Z Wei, Y Wang - The Thirty-eighth Annual …, 2024 - openreview.net

While Large Language Models (LLMs) have achieved tremendous success in various
applications, they are also susceptible to jailbreaking attacks. Several primary defense …

Cited by 4 - Related articles

Embedding self-correction as an inherent ability in large language models for enhanced mathematical reasoning

K Gao, H Cai, Q Shuai, D Gong, Z Li - arXiv preprint arXiv:2410.10735, 2024 - arxiv.org

Accurate mathematical reasoning with Large Language Models (LLMs) is crucial in
revolutionizing domains that heavily rely on such reasoning. However, LLMs often …

Cited by 2 - Related articles - All 2 versions