Foundational challenges in assuring alignment and safety of large language models

U Anwar, A Saparov, J Rando, D Paleka… - arXiv preprint arXiv …, 2024 - arxiv.org
This work identifies 18 foundational challenges in assuring the alignment and safety of large
language models (LLMs). These challenges are organized into three different categories …

A phase transition in diffusion models reveals the hierarchical nature of data

A Sclocchi, A Favero, M Wyart - Proceedings of the National Academy of …, 2025 - pnas.org
Understanding the structure of real data is paramount in advancing modern deep-learning
methodologies. Natural data such as images are believed to be composed of features …

Mechanistically analyzing the effects of fine-tuning on procedurally defined tasks

S Jain, R Kirk, ES Lubana, RP Dick, H Tanaka… - arXiv preprint arXiv …, 2023 - arxiv.org
Fine-tuning large pre-trained models has become the de facto strategy for developing both
task-specific and general-purpose machine learning systems, including developing models …

Eclipse: A resource-efficient text-to-image prior for image generations

M Patel, C Kim, S Cheng, C Baral… - Proceedings of the …, 2024 - openaccess.thecvf.com
Text-to-image (T2I) diffusion models, notably the unCLIP models (e.g., DALL-E-2), achieve state-of-the-art (SOTA) performance on various compositional T2I benchmarks at …

Who's in and who's out? A case study of multimodal CLIP-filtering in DataComp

R Hong, W Agnew, T Kohno… - Proceedings of the 4th …, 2024 - dl.acm.org
As training datasets become increasingly drawn from unstructured, uncontrolled
environments such as the web, researchers and industry practitioners have increasingly …

How capable can a transformer become? A study on synthetic, interpretable tasks

R Ramesh, M Khona, RP Dick, H Tanaka, ES Lubana - 2023 - par.nsf.gov
Transformers trained on huge text corpora exhibit a remarkable set of capabilities, e.g.,
performing simple logical operations. Given the inherent compositional nature of language …

An analytic theory of creativity in convolutional diffusion models

M Kamb, S Ganguli - arXiv preprint arXiv:2412.20292, 2024 - arxiv.org
We obtain the first analytic, interpretable and predictive theory of creativity in convolutional
diffusion models. Indeed, score-based diffusion models can generate highly creative images …

Why do animals need shaping? a theory of task composition and curriculum learning

JH Lee, SS Mannelli, A Saxe - arXiv preprint arXiv:2402.18361, 2024 - arxiv.org
Diverse studies in systems neuroscience begin with extended periods of training known as
'shaping' procedures. These involve progressively studying component parts of more …

Skews in the Phenomenon Space Hinder Generalization in Text-to-Image Generation

Y Chang, Y Zhang, Z Fang, Y Wu, Y Bisk… - arXiv preprint arXiv …, 2024 - arxiv.org
The literature on text-to-image generation is plagued by issues of faithfully composing
entities with relations. However, the field lacks a formal understanding of how entity-relation …

Initialization is Critical to Whether Transformers Fit Composite Functions by Inference or Memorizing

Z Zhang, P Lin, Z Wang, Y Zhang, ZQJ Xu - arXiv preprint arXiv …, 2024 - arxiv.org
Transformers have shown impressive capabilities across various tasks, but their
performance on compositional problems remains a topic of debate. In this work, we …