Teaching arithmetic to small transformers

N Lee, K Sreenivasan, JD Lee, K Lee… - arXiv preprint arXiv …, 2023 - arxiv.org
Large language models like GPT-4 exhibit emergent capabilities across general-purpose
tasks, such as basic arithmetic, when trained on extensive text data, even though these tasks …

What algorithms can transformers learn? A study in length generalization

H Zhou, A Bradley, E Littwin, N Razin, O Saremi… - arXiv preprint arXiv …, 2023 - arxiv.org
Large language models exhibit surprising emergent generalization properties, yet also
struggle on many simple reasoning tasks such as arithmetic and parity. This raises the …
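
For readers unfamiliar with the length-generalization setup studied here, the sketch below is a minimal illustration (not from the paper) of how a parity task is typically evaluated on sequences longer than any seen in training; model_predict is a hypothetical stand-in for a trained transformer.

# Illustrative sketch of a parity length-generalization check (not from the paper).
# `model_predict` is a hypothetical placeholder for a trained sequence model.
import random

def parity(bits):
    # Ground-truth label: 1 if the number of ones is odd, else 0.
    return sum(bits) % 2

def sample_bits(length):
    return [random.randint(0, 1) for _ in range(length)]

def accuracy_at_length(model_predict, length, n_samples=200):
    correct = 0
    for _ in range(n_samples):
        bits = sample_bits(length)
        correct += int(model_predict(bits) == parity(bits))
    return correct / n_samples

# Length generalization asks whether accuracy holds up when test sequences
# are strictly longer than any training sequence.
train_lengths = range(1, 21)        # e.g. the model only ever saw lengths 1-20
test_lengths = [25, 30, 40, 60]     # out-of-distribution evaluation lengths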

Teaching algorithmic reasoning via in-context learning

H Zhou, A Nova, H Larochelle, A Courville… - arXiv preprint arXiv …, 2022 - arxiv.org
Large language models (LLMs) have shown increasing in-context learning capabilities
through scaling up model and data size. Despite this progress, LLMs are still unable to solve …
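
As a rough illustration of the in-context approach referred to here, one can prepend worked, step-by-step exemplars before a query so the model imitates the procedure at inference time; the prompt format below is an assumption for illustration, not the paper's exact scheme.

# Hedged illustration of few-shot prompting with worked, step-by-step exemplars
# (the format is assumed, not taken from the paper).
def worked_addition(a, b):
    # Produce a digit-by-digit explanation of a + b, least significant digit first.
    lines = [f"Q: {a} + {b} = ?"]
    x, y, carry, position = a, b, 0, 1
    while x or y or carry:
        da, db = x % 10, y % 10
        s = da + db + carry
        lines.append(f"digit {position}: {da} + {db} + carry {carry} = {s}, write {s % 10}")
        carry, x, y, position = s // 10, x // 10, y // 10, position + 1
    lines.append(f"A: {a + b}")
    return "\n".join(lines)

# A few worked exemplars followed by the new query; the model is expected to
# continue the pattern and execute the same procedure in context.
prompt = "\n\n".join([worked_addition(47, 85), worked_addition(236, 198), "Q: 518 + 764 = ?"])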

Can neural networks do arithmetic? A survey on the elementary numerical skills of state-of-the-art deep learning models

A Testolin - Applied Sciences, 2024 - mdpi.com
Creating learning models that can exhibit sophisticated reasoning abilities is one of the
greatest challenges in deep learning research, and mathematics is rapidly becoming one of …

Goat: Fine-tuned LLaMA outperforms GPT-4 on arithmetic tasks

T Liu, BKH Low - arXiv preprint arXiv:2305.14201, 2023 - arxiv.org
We introduce Goat, a fine-tuned LLaMA model that significantly outperforms GPT-4 on a
range of arithmetic tasks. Fine-tuned on a synthetically generated dataset, Goat achieves …
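
To make the "synthetically generated dataset" concrete, here is a minimal sketch of how instruction-style arithmetic pairs might be produced; the field names, templates, and operand ranges are illustrative assumptions, not the actual data format used for Goat.

# Illustrative generator of synthetic arithmetic fine-tuning pairs.
# The "prompt"/"completion" fields and operand ranges are assumptions.
import json
import random

def make_example(max_digits=8):
    a = random.randint(0, 10 ** max_digits - 1)
    b = random.randint(0, 10 ** max_digits - 1)
    op = random.choice(["+", "-", "*"])
    answer = {"+": a + b, "-": a - b, "*": a * b}[op]
    return {"prompt": f"{a} {op} {b} = ", "completion": str(answer)}

# Write a small JSONL file of exact-arithmetic targets for supervised fine-tuning.
with open("synthetic_arithmetic.jsonl", "w") as f:
    for _ in range(1000):
        f.write(json.dumps(make_example()) + "\n")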

Identifying weaknesses in machine translation metrics through minimum Bayes risk decoding: A case study for COMET

C Amrhein, R Sennrich - arXiv preprint arXiv:2202.05148, 2022 - arxiv.org
Neural metrics have achieved impressive correlation with human judgements in the
evaluation of machine translation systems, but before we can safely optimise towards such …
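
Minimum Bayes risk (MBR) decoding, the procedure examined in this case study, selects from a pool of sampled translations the candidate with the highest expected utility under a metric, with the other samples acting as pseudo-references. The sketch below uses a placeholder token-overlap utility purely for illustration; it does not call COMET.

# Minimal sketch of minimum Bayes risk (MBR) decoding over sampled candidates.
# `utility` stands in for a learned metric such as COMET; here it is a
# placeholder token-overlap score, purely for illustration.
def utility(hypothesis: str, pseudo_reference: str) -> float:
    h, r = set(hypothesis.split()), set(pseudo_reference.split())
    return len(h & r) / max(len(h | r), 1)

def mbr_decode(candidates):
    # Score each candidate by its average utility against all other candidates,
    # then return the highest-scoring one.
    def expected_utility(c):
        others = [r for r in candidates if r is not c]
        return sum(utility(c, r) for r in others) / max(len(others), 1)
    return max(candidates, key=expected_utility)

samples = ["the cat sat on the mat", "a cat sat on the mat", "the cat is on a mat"]
print(mbr_decode(samples))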

Large language models: a primer and gastroenterology applications

O Shahab, B El Kurdi, A Shaukat… - Therapeutic …, 2024 - journals.sagepub.com
Over the past year, the emergence of state-of-the-art large language models (LLMs) in tools
like ChatGPT has ushered in a rapid acceleration in artificial intelligence (AI) innovation …

Exploring the numerical reasoning capabilities of language models: A comprehensive analysis on tabular data

M Akhtar, A Shankarampeta, V Gupta, A Patil… - arXiv preprint arXiv …, 2023 - arxiv.org
Numbers are crucial for various real-world domains such as finance, economics, and
science. Thus, understanding and reasoning with numbers are essential skills for language …

Induced natural language rationales and interleaved markup tokens enable extrapolation in large language models

M Bueno, C Gemmell, J Dalton, R Lotufo… - arXiv preprint arXiv …, 2022 - arxiv.org
The ability to extrapolate, i.e., to make predictions on sequences that are longer than those
presented as training examples, is a challenging problem for current deep learning models …

Assessing GPT-3.5 and GPT-4 in generating international classification of diseases billing codes

A Soroush, BS Glicksberg, E Zimlichman, Y Barash… - medRxiv, 2023 - medrxiv.org
Background Large Language Models (LLMs) like GPT-3.5 and GPT-4 are increasingly
entering the healthcare domain as a proposed means to assist with administrative tasks. To …