Foundational challenges in assuring alignment and safety of large language models

U Anwar, A Saparov, J Rando, D Paleka… - arXiv preprint arXiv …, 2024 - arxiv.org
This work identifies 18 foundational challenges in assuring the alignment and safety of large
language models (LLMs). These challenges are organized into three different categories …

A phase transition in diffusion models reveals the hierarchical nature of data

A Sclocchi, A Favero, M Wyart - Proceedings of the National Academy of …, 2025 - pnas.org
Understanding the structure of real data is paramount in advancing modern deep-learning
methodologies. Natural data such as images are believed to be composed of features …

Mechanistically analyzing the effects of fine-tuning on procedurally defined tasks

S Jain, R Kirk, ES Lubana, RP Dick, H Tanaka… - arXiv preprint arXiv …, 2023 - arxiv.org
Fine-tuning large pre-trained models has become the de facto strategy for developing both
task-specific and general-purpose machine learning systems, including developing models …

Eclipse: A resource-efficient text-to-image prior for image generations

M Patel, C Kim, S Cheng, C Baral… - Proceedings of the …, 2024 - openaccess.thecvf.com
Text-to-image (T2I) diffusion models, notably the unCLIP models (e.g., DALL-E-2), achieve state-of-the-art (SOTA) performance on various compositional T2I benchmarks at …

Who's in and who's out? A case study of multimodal CLIP-filtering in DataComp

R Hong, W Agnew, T Kohno… - Proceedings of the 4th …, 2024 - dl.acm.org
As training datasets become increasingly drawn from unstructured, uncontrolled
environments such as the web, researchers and industry practitioners have increasingly …

How capable can a transformer become? A study on synthetic, interpretable tasks

R Ramesh, M Khona, RP Dick, H Tanaka, ES Lubana - 2023 - par.nsf.gov
Transformers trained on huge text corpora exhibit a remarkable set of capabilities, e.g.,
performing simple logical operations. Given the inherent compositional nature of language …

An analytic theory of creativity in convolutional diffusion models

M Kamb, S Ganguli - arXiv preprint arXiv:2412.20292, 2024 - arxiv.org
We obtain the first analytic, interpretable and predictive theory of creativity in convolutional
diffusion models. Indeed, score-based diffusion models can generate highly creative images …

Why do animals need shaping? a theory of task composition and curriculum learning

JH Lee, SS Mannelli, A Saxe - arXiv preprint arXiv:2402.18361, 2024 - arxiv.org
Diverse studies in systems neuroscience begin with extended periods of training known as
'shaping' procedures. These involve progressively studying component parts of more …

Skews in the Phenomenon Space Hinder Generalization in Text-to-Image Generation

Y Chang, Y Zhang, Z Fang, Y Wu, Y Bisk… - arXiv preprint arXiv …, 2024 - arxiv.org
The literature on text-to-image generation is plagued by issues of faithfully composing
entities with relations. However, the field lacks a formal understanding of how entity-relation …

Initialization is Critical to Whether Transformers Fit Composite Functions by Inference or Memorizing

Z Zhang, P Lin, Z Wang, Y Zhang, ZQJ Xu - arXiv preprint arXiv …, 2024 - arxiv.org
Transformers have shown impressive capabilities across various tasks, but their
performance on compositional problems remains a topic of debate. In this work, we …