Teaching arithmetic to small transformers
Large language models like GPT-4 exhibit emergent capabilities across general-purpose
tasks, such as basic arithmetic, when trained on extensive text data, even though these tasks …
What algorithms can transformers learn? A study in length generalization
Large language models exhibit surprising emergent generalization properties, yet also
struggle on many simple reasoning tasks such as arithmetic and parity. This raises the …
Teaching algorithmic reasoning via in-context learning
Large language models (LLMs) have shown increasing in-context learning capabilities
through scaling up model and data size. Despite this progress, LLMs are still unable to solve …
Can neural networks do arithmetic? A survey on the elementary numerical skills of state-of-the-art deep learning models
A Testolin - Applied Sciences, 2024 - mdpi.com
Creating learning models that can exhibit sophisticated reasoning abilities is one of the
greatest challenges in deep learning research, and mathematics is rapidly becoming one of …
Goat: Fine-tuned LLaMA outperforms GPT-4 on arithmetic tasks
We introduce Goat, a fine-tuned LLaMA model that significantly outperforms GPT-4 on a
range of arithmetic tasks. Fine-tuned on a synthetically generated dataset, Goat achieves …
Identifying weaknesses in machine translation metrics through minimum Bayes risk decoding: A case study for COMET
C Amrhein, R Sennrich - arXiv preprint arXiv:2202.05148, 2022 - arxiv.org
Neural metrics have achieved impressive correlation with human judgements in the
evaluation of machine translation systems, but before we can safely optimise towards such …
Large language models: a primer and gastroenterology applications
Over the past year, the emergence of state-of-the-art large language models (LLMs) in tools
like ChatGPT has ushered in a rapid acceleration in artificial intelligence (AI) innovation …
Exploring the numerical reasoning capabilities of language models: A comprehensive analysis on tabular data
Numbers are crucial for various real-world domains such as finance, economics, and
science. Thus, understanding and reasoning with numbers are essential skills for language …
Induced natural language rationales and interleaved markup tokens enable extrapolation in large language models
The ability to extrapolate, i.e., to make predictions on sequences that are longer than those
presented as training examples, is a challenging problem for current deep learning models …
Assessing GPT-3.5 and GPT-4 in generating international classification of diseases billing codes
Background: Large Language Models (LLMs) like GPT-3.5 and GPT-4 are increasingly
entering the healthcare domain as a proposed means to assist with administrative tasks. To …