QA dataset explosion: A taxonomy of NLP resources for question answering and reading comprehension

A Rogers, M Gardner, I Augenstein - ACM Computing Surveys, 2023 - dl.acm.org
Alongside huge volumes of research on deep learning models in NLP in recent years,
there has been much work on benchmark datasets needed to track modeling progress …

A survey of deep learning for mathematical reasoning

P Lu, L Qiu, W Yu, S Welleck, KW Chang - arXiv preprint arXiv:2212.10535, 2022 - arxiv.org
Mathematical reasoning is a fundamental aspect of human intelligence and is applicable in
various fields, including science, engineering, finance, and everyday life. The development …

Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks

W Chen, X Ma, X Wang, WW Cohen - arXiv preprint arXiv:2211.12588, 2022 - arxiv.org
Recently, there has been significant progress in teaching language models to perform
step-by-step reasoning to solve complex numerical reasoning tasks. Chain-of-thoughts prompting …

Can LLM already serve as a database interface? A big bench for large-scale database grounded text-to-SQLs

J Li, B Hui, G Qu, J Yang, B Li, B Li… - Advances in …, 2024 - proceedings.neurips.cc
Text-to-SQL parsing, which aims at converting natural language instructions into executable
SQLs, has gained increasing attention in recent years. In particular, GPT-4 and Claude-2 …

Dynamic prompt learning via policy gradient for semi-structured mathematical reasoning

P Lu, L Qiu, KW Chang, YN Wu, SC Zhu… - arXiv preprint arXiv …, 2022 - arxiv.org
Mathematical reasoning, a core ability of human intelligence, presents unique challenges for
machines in abstract thinking and logical reasoning. Recent large pre-trained language …

TheoremQA: A theorem-driven question answering dataset

W Chen, M Yin, M Ku, P Lu, Y Wan, X Ma… - Proceedings of the …, 2023 - aclanthology.org
The recent LLMs like GPT-4 and PaLM-2 have made tremendous progress in solving
fundamental math problems like GSM8K by achieving over 90% accuracy. However, their …

PIXIU: A large language model, instruction data and evaluation benchmark for finance

Q Xie, W Han, X Zhang, Y Lai, M Peng… - arXiv preprint arXiv …, 2023 - arxiv.org
Although large language models (LLMs) have shown great performance on natural language
processing (NLP) in the financial domain, there are no publicly available financial tailored …

A survey of GPT-3 family large language models including ChatGPT and GPT-4

KS Kalyan - Natural Language Processing Journal, 2023 - Elsevier
Large language models (LLMs) are a special class of pretrained language models (PLMs)
obtained by scaling model size, pretraining corpus and computation. LLMs, because of their …

Large language models are few (1)-shot table reasoners

W Chen - arXiv preprint arXiv:2210.06710, 2022 - arxiv.org
Recent literature has shown that large language models (LLMs) are generally excellent
few-shot reasoners to solve text reasoning tasks. However, the capability of LLMs on table …

MultiHiertt: Numerical reasoning over multi hierarchical tabular and textual data

Y Zhao, Y Li, C Li, R Zhang - arXiv preprint arXiv:2206.01347, 2022 - arxiv.org
Numerical reasoning over hybrid data containing both textual and tabular content (e.g.,
financial reports) has recently attracted much attention in the NLP community. However …