QA dataset explosion: A taxonomy of NLP resources for question answering and reading comprehension

A Rogers, M Gardner, I Augenstein - ACM Computing Surveys, 2023 - dl.acm.org
Alongside huge volumes of research on deep learning models in NLP in recent years,
there has been much work on benchmark datasets needed to track modeling progress …

A survey of deep learning for mathematical reasoning

P Lu, L Qiu, W Yu, S Welleck, KW Chang - arXiv preprint arXiv:2212.10535, 2022 - arxiv.org
Mathematical reasoning is a fundamental aspect of human intelligence and is applicable in
various fields, including science, engineering, finance, and everyday life. The development …

Augmented language models: a survey

G Mialon, R Dessì, M Lomeli, C Nalmpantis… - arXiv preprint arXiv …, 2023 - arxiv.org
This survey reviews works in which language models (LMs) are augmented with reasoning
skills and the ability to use tools. The former is defined as decomposing a potentially …

Solving quantitative reasoning problems with language models

A Lewkowycz, A Andreassen… - Advances in …, 2022 - proceedings.neurips.cc
Language models have achieved remarkable performance on a wide range of
tasks that require natural language understanding. Nevertheless, state-of-the-art models …

Reasoning with language model prompting: A survey

S Qiao, Y Ou, N Zhang, X Chen, Y Yao, S Deng… - arXiv preprint arXiv …, 2022 - arxiv.org
Reasoning, as an essential ability for complex problem-solving, can provide back-end
support for various real-world applications, such as medical diagnosis, negotiation, etc. This …

Cross-task generalization via natural language crowdsourcing instructions

S Mishra, D Khashabi, C Baral, H Hajishirzi - arXiv preprint arXiv …, 2021 - arxiv.org
Humans (e.g., crowdworkers) have a remarkable ability to solve different tasks simply
by reading textual instructions that define them and looking at a few examples. Despite the …

Cheap and quick: Efficient vision-language instruction tuning for large language models

G Luo, Y Zhou, T Ren, S Chen… - Advances in Neural …, 2024 - proceedings.neurips.cc
Recently, there has been growing interest in extending the multimodal capability of large
language models (LLMs), e.g., vision-language (VL) learning, which is regarded as the next …

How does GPT-2 compute greater-than?: Interpreting mathematical abilities in a pre-trained language model

M Hanna, O Liu, A Variengien - Advances in Neural …, 2024 - proceedings.neurips.cc
Pre-trained language models can be surprisingly adept at tasks they were not explicitly
trained on, but how they implement these capabilities is poorly understood. In this paper, we …

MAmmoTH: Building math generalist models through hybrid instruction tuning

X Yue, X Qu, G Zhang, Y Fu, W Huang, H Sun… - arXiv preprint arXiv …, 2023 - arxiv.org
We introduce MAmmoTH, a series of open-source large language models (LLMs)
specifically tailored for general math problem-solving. The MAmmoTH models are trained on …

FOLIO: Natural language reasoning with first-order logic

S Han, H Schoelkopf, Y Zhao, Z Qi, M Riddell… - arXiv preprint arXiv …, 2022 - arxiv.org
Large language models (LLMs) have achieved remarkable performance on a variety of
natural language understanding tasks. However, existing benchmarks are inadequate in …