Factuality challenges in the era of large language models and opportunities for fact-checking

I Augenstein, T Baldwin, M Cha… - Nature Machine …, 2024 - nature.com
The emergence of tools based on large language models (LLMs), such as OpenAI's
ChatGPT and Google's Gemini, has garnered immense public attention owing to their …

The mechanistic basis of data dependence and abrupt learning in an in-context classification task

G Reddy - The Twelfth International Conference on Learning …, 2023 - openreview.net
Transformer models exhibit in-context learning: the ability to accurately predict the response
to a novel query based on illustrative examples in the input sequence, which contrasts with …

Can language models handle recursively nested grammatical structures? A case study on comparing models and humans

A Lampinen - Computational Linguistics, 2024 - direct.mit.edu
How should we compare the capabilities of language models (LMs) and humans? In this
paper, I draw inspiration from comparative psychology to highlight challenges in these …

In-context principle learning from mistakes

T Zhang, A Madaan, L Gao, S Zheng, S Mishra… - arXiv preprint arXiv …, 2024 - arxiv.org
In-context learning (ICL, also known as few-shot prompting) has been the standard method
of adapting LLMs to downstream tasks by learning from a few input-output examples …

Retrieval-augmented generation to improve math question-answering: Trade-offs between groundedness and human preference

Z Levonian, C Li, W Zhu, A Gade, O Henkel… - arXiv preprint arXiv …, 2023 - arxiv.org
For middle-school math students, interactive question-answering (QA) with tutors is an
effective way to learn. The flexibility and emergent capabilities of generative large language …

What Makes Multimodal In-Context Learning Work?

FB Baldassini, M Shukor, M Cord… - Proceedings of the …, 2024 - openaccess.thecvf.com
Large Language Models have demonstrated remarkable performance across
various tasks, exhibiting the capacity to swiftly acquire new skills, such as through In-Context …

The ARRT of Language-Models-as-a-Service: Overview of a New Paradigm and its Challenges

E La Malfa, A Petrov, S Frieder, C Weinhuber… - arXiv preprint arXiv …, 2023 - arxiv.org
Some of the most powerful language models currently are proprietary systems, accessible
only via (typically restrictive) web or software programming interfaces. This is the Language …

Competition-level problems are effective LLM evaluators

Y Huang, Z Lin, X Liu, Y Gong, S Lu, F Lei… - Findings of the …, 2024 - aclanthology.org
Large language models (LLMs) have demonstrated impressive reasoning capabilities, yet
there is ongoing debate about these abilities and the potential data contamination problem …

Do LLM agents have regret? A case study in online learning and games

C Park, X Liu, A Ozdaglar, K Zhang - arXiv preprint arXiv:2403.16843, 2024 - arxiv.org
Large language models (LLMs) have been increasingly employed for (interactive) decision-
making, via the development of LLM-based autonomous agents. Despite their emerging …

How capable can a transformer become? A study on synthetic, interpretable tasks

R Ramesh, M Khona, RP Dick, H Tanaka… - arXiv preprint arXiv …, 2023 - arxiv.org
Transformers trained on huge text corpora exhibit a remarkable set of capabilities, e.g.,
performing simple logical operations. Given the inherent compositional nature of language …