Language model behavior: A comprehensive survey

TA Chang, BK Bergen - Computational Linguistics, 2024 - direct.mit.edu
Transformer language models have received widespread public attention, yet their
generated text is often surprising even to NLP researchers. In this survey, we discuss over …

Testing the general deductive reasoning capacity of large language models using OOD examples

A Saparov, RY Pang, V Padmakumar… - Advances in …, 2024 - proceedings.neurips.cc
Given the intractably large size of the space of proofs, any model that is capable of general
deductive reasoning must generalize to proofs of greater complexity. Recent studies have …

How Do In-Context Examples Affect Compositional Generalization?

S An, Z Lin, Q Fu, B Chen, N Zheng, JG Lou… - arXiv preprint arXiv …, 2023 - arxiv.org
Compositional generalization (understanding unseen combinations of seen primitives) is an
essential reasoning capability in human intelligence. The AI community mainly studies this …
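
To make the notion concrete, here is a minimal sketch (not taken from the paper) of a SCAN-style in-context prompt in which the primitives "jump" and "twice" each appear in the demonstrations but never together, so the query probes an unseen combination of seen primitives. All strings are illustrative:

```python
# Minimal sketch of a SCAN-style in-context prompt probing compositional
# generalization: the primitives "jump" and "twice" each appear in the
# demonstrations, but their combination does not. Strings are illustrative,
# not drawn from the paper's actual data.
DEMONSTRATIONS = [
    ("walk", "WALK"),
    ("walk twice", "WALK WALK"),
    ("jump", "JUMP"),
    ("look twice", "LOOK LOOK"),
]

def build_prompt(query: str) -> str:
    """Format the demonstrations plus a held-out compositional query."""
    lines = [f"IN: {x} OUT: {y}" for x, y in DEMONSTRATIONS]
    lines.append(f"IN: {query} OUT:")
    return "\n".join(lines)

if __name__ == "__main__":
    # "jump twice" is an unseen combination of seen primitives;
    # the target completion would be "JUMP JUMP".
    print(build_prompt("jump twice"))
```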

Instruct me more! Random prompting for visual in-context learning

J Zhang, B Wang, L Li, Y Nakashima… - Proceedings of the …, 2024 - openaccess.thecvf.com
Large-scale models trained on extensive datasets have emerged as the preferred approach
due to their high generalizability across various tasks. In-context learning (ICL), a popular …

How capable can a transformer become? A study on synthetic, interpretable tasks

R Ramesh, M Khona, RP Dick, H Tanaka… - arXiv preprint arXiv …, 2023 - arxiv.org
Transformers trained on huge text corpora exhibit a remarkable set of capabilities, e.g.,
performing simple logical operations. Given the inherent compositional nature of language …

Leveraging code to improve in-context learning for semantic parsing

B Bogin, S Gupta, P Clark, A Sabharwal - arXiv preprint arXiv:2311.09519, 2023 - arxiv.org
In-context learning (ICL) is an appealing approach for semantic parsing due to its few-shot
nature and improved generalization. However, learning to parse to rare domain-specific …

The validity of evaluation results: Assessing concurrence across compositionality benchmarks

K Sun, A Williams, D Hupkes - arXiv preprint arXiv:2310.17514, 2023 - arxiv.org
NLP models have progressed drastically in recent years, according to the numerous datasets
proposed to evaluate their performance. Questions remain, however, about how particular …

Adapt and Decompose: Efficient Generalization of Text-to-SQL via Domain Adapted Least-To-Most Prompting

A Arora, S Bhaisaheb, M Patwardhan, L Vig… - arXiv preprint arXiv …, 2023 - arxiv.org
Cross-domain and cross-compositional generalization of Text-to-SQL semantic parsing is a
challenging task. Existing Large Language Model (LLM) based solutions rely on inference …
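
Least-to-most prompting decomposes a hard question into simpler subquestions that are solved in sequence, with earlier answers fed back into later prompts. A hedged two-stage sketch, where `complete()` is a hypothetical stand-in for whatever LLM completion call is available (not an API from the paper):

```python
# Hedged sketch of least-to-most prompting for Text-to-SQL: stage 1 asks the
# model to decompose the question; stage 2 answers each subquestion in order,
# carrying earlier answers forward in the prompt. `complete` is a hypothetical
# stand-in for an actual LLM completion call.
def complete(prompt: str) -> str:
    raise NotImplementedError("plug in an actual LLM completion call here")

def least_to_most_sql(question: str, schema: str) -> str:
    # Stage 1: decompose the question into simpler subquestions.
    decomposition = complete(
        f"Schema:\n{schema}\n"
        f"Break this question into simpler subquestions, one per line:\n"
        f"{question}"
    )
    subquestions = [s for s in decomposition.splitlines() if s.strip()]

    # Stage 2: solve subquestions sequentially, carrying context forward.
    context = f"Schema:\n{schema}\n"
    sql = ""
    for sub in subquestions + [question]:
        sql = complete(f"{context}Write SQL for: {sub}\nSQL:")
        context += f"Q: {sub}\nSQL: {sql}\n"
    return sql  # the SQL for the final, full question
```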

Power-up! what can generative models do for human computation workflows?

G Allen, G He, U Gadiraju - arXiv preprint arXiv:2307.02243, 2023 - arxiv.org
We are amidst an explosion of artificial intelligence research, particularly around large
language models (LLMs). These models have a range of applications across domains like …

Attention as a Hypernetwork

S Schug, S Kobayashi, Y Akram, J Sacramento… - arXiv preprint arXiv …, 2024 - arxiv.org
Transformers can, under some circumstances, generalize to novel problem instances whose
constituent parts might have been encountered during training but whose compositions …
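
One way to read the title's claim: for each query-key pair, the vector of per-head attention scores can be viewed as a latent code that mixes fixed per-head value/output maps into a single input-dependent linear network. A numpy sketch of this algebraic regrouping of standard multi-head attention; shapes and variable names are illustrative, not the paper's exact formulation:

```python
import numpy as np

# Hedged sketch of the "attention as hypernetwork" reading of multi-head
# attention: the head-score vector at each (query, key) pair acts as a latent
# code that mixes per-head value/output maps into one linear map.
rng = np.random.default_rng(0)
T, d, H, dh = 5, 8, 2, 4          # tokens, model dim, heads, head dim
x = rng.normal(size=(T, d))
Wq = rng.normal(size=(H, d, dh))
Wk = rng.normal(size=(H, d, dh))
Wv = rng.normal(size=(H, d, dh))
Wo = rng.normal(size=(H, dh, d))

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

# Per-head attention scores a[h, i, j].
q = np.einsum("td,hde->hte", x, Wq)
k = np.einsum("td,hde->hte", x, Wk)
a = softmax(np.einsum("hte,hse->hts", q, k) / np.sqrt(dh), axis=-1)

# Standard view: sum over heads of attention-weighted value/output maps.
standard = np.einsum("hts,sd,hde,hef->tf", a, x, Wv, Wo)

# Hypernetwork view: for each (i, j), the head-score vector a[:, i, j]
# mixes the fixed per-head maps Wv[h] @ Wo[h] into one linear map applied
# to x[j], i.e. an input-dependent linear network.
per_head_map = np.einsum("hde,hef->hdf", Wv, Wo)      # shape (H, d, d)
hyper = np.einsum("hts,hdf,sd->tf", a, per_head_map, x)

print("both views agree:", np.allclose(standard, hyper))
```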