Diffusion models: A comprehensive survey of methods and applications

L Yang, Z Zhang, Y Song, S Hong, R Xu, Y Zhao… - ACM Computing …, 2023 - dl.acm.org
Diffusion models have emerged as a powerful new family of deep generative models with
record-breaking performance in many applications, including image synthesis, video …

Unleashing the power of edge-cloud generative ai in mobile networks: A survey of aigc services

M Xu, H Du, D Niyato, J Kang, Z Xiong… - … Surveys & Tutorials, 2024 - ieeexplore.ieee.org
Artificial Intelligence-Generated Content (AIGC) is an automated method for generating,
manipulating, and modifying valuable and diverse data using AI algorithms creatively. This …

Align your latents: High-resolution video synthesis with latent diffusion models

A Blattmann, R Rombach, H Ling… - Proceedings of the …, 2023 - openaccess.thecvf.com
Abstract Latent Diffusion Models (LDMs) enable high-quality image synthesis while avoiding
excessive compute demands by training a diffusion model in a compressed lower …

Tune-a-video: One-shot tuning of image diffusion models for text-to-video generation

JZ Wu, Y Ge, X Wang, SW Lei, Y Gu… - Proceedings of the …, 2023 - openaccess.thecvf.com
To replicate the success of text-to-image (T2I) generation, recent works employ large-scale
video datasets to train a text-to-video (T2V) generator. Despite their promising results, such …

Reproducible scaling laws for contrastive language-image learning

M Cherti, R Beaumont, R Wightman… - Proceedings of the …, 2023 - openaccess.thecvf.com
Scaling up neural networks has led to remarkable performance across a wide range of
tasks. Moreover, performance often follows reliable scaling laws as a function of training set …

Text2video-zero: Text-to-image diffusion models are zero-shot video generators

L Khachatryan, A Movsisyan… - Proceedings of the …, 2023 - openaccess.thecvf.com
Recent text-to-video generation approaches rely on computationally heavy training and
require large-scale video datasets. In this paper, we introduce a new task, zero-shot text-to …

Sdxl: Improving latent diffusion models for high-resolution image synthesis

D Podell, Z English, K Lacey, A Blattmann… - arXiv preprint arXiv …, 2023 - arxiv.org
We present SDXL, a latent diffusion model for text-to-image synthesis. Compared to
previous versions of Stable Diffusion, SDXL leverages a three times larger UNet backbone …

Structure and content-guided video synthesis with diffusion models

P Esser, J Chiu, P Atighehchian… - Proceedings of the …, 2023 - openaccess.thecvf.com
Text-guided generative diffusion models unlock powerful image creation and editing tools.
Recent approaches that edit the content of footage while retaining structure require …

ediff-i: Text-to-image diffusion models with an ensemble of expert denoisers

Y Balaji, S Nah, X Huang, A Vahdat, J Song… - arXiv preprint arXiv …, 2022 - arxiv.org
Large-scale diffusion-based generative models have led to breakthroughs in text-
conditioned high-resolution image synthesis. Starting from random noise, such text-to-image …

Consistency models

Y Song, P Dhariwal, M Chen, I Sutskever - arXiv preprint arXiv:2303.01469, 2023 - arxiv.org
Diffusion models have significantly advanced the fields of image, audio, and video
generation, but they depend on an iterative sampling process that causes slow generation …