Imagen Video: High definition video generation with diffusion models

L Yang, Z Zhang, Y Song, S Hong, R Xu, Y Zhao… - ACM Computing …, 2023 - dl.acm.org

Diffusion models have emerged as a powerful new family of deep generative models with
record-breaking performance in many applications, including image synthesis, video …

Cited by 911 Related articles All 6 versions

Unleashing the power of edge-cloud generative AI in mobile networks: A survey of AIGC services

M Xu, H Du, D Niyato, J Kang, Z Xiong… - … Surveys & Tutorials, 2024 - ieeexplore.ieee.org

Artificial Intelligence-Generated Content (AIGC) is an automated method for generating,
manipulating, and modifying valuable and diverse data using AI algorithms creatively. This …

Cited by 114 Related articles All 5 versions

Align your latents: High-resolution video synthesis with latent diffusion models

A Blattmann, R Rombach, H Ling… - Proceedings of the …, 2023 - openaccess.thecvf.com

Latent Diffusion Models (LDMs) enable high-quality image synthesis while avoiding
excessive compute demands by training a diffusion model in a compressed lower …

Cited by 506 Related articles All 6 versions

Tune-A-Video: One-shot tuning of image diffusion models for text-to-video generation

JZ Wu, Y Ge, X Wang, SW Lei, Y Gu… - Proceedings of the …, 2023 - openaccess.thecvf.com

To replicate the success of text-to-image (T2I) generation, recent works employ large-scale
video datasets to train a text-to-video (T2V) generator. Despite their promising results, such …

Cited by 431 Related articles All 4 versions

Reproducible scaling laws for contrastive language-image learning

M Cherti, R Beaumont, R Wightman… - Proceedings of the …, 2023 - openaccess.thecvf.com

Scaling up neural networks has led to remarkable performance across a wide range of
tasks. Moreover, performance often follows reliable scaling laws as a function of training set …

Cited by 388 Related articles All 6 versions

Text2Video-Zero: Text-to-image diffusion models are zero-shot video generators

L Khachatryan, A Movsisyan… - Proceedings of the …, 2023 - openaccess.thecvf.com

Recent text-to-video generation approaches rely on computationally heavy training and
require large-scale video datasets. In this paper, we introduce a new task, zero-shot text-to …

Cited by 288 Related articles All 7 versions

SDXL: Improving latent diffusion models for high-resolution image synthesis

D Podell, Z English, K Lacey, A Blattmann… - arXiv preprint arXiv …, 2023 - arxiv.org

We present SDXL, a latent diffusion model for text-to-image synthesis. Compared to
previous versions of Stable Diffusion, SDXL leverages a three times larger UNet backbone …

Cited by 661 Related articles All 4 versions

Structure and content-guided video synthesis with diffusion models

P Esser, J Chiu, P Atighehchian… - Proceedings of the …, 2023 - openaccess.thecvf.com

Text-guided generative diffusion models unlock powerful image creation and editing tools.
Recent approaches that edit the content of footage while retaining structure require …

Cited by 304 Related articles All 5 versions

eDiff-I: Text-to-image diffusion models with an ensemble of expert denoisers

Y Balaji, S Nah, X Huang, A Vahdat, J Song… - arXiv preprint arXiv …, 2022 - arxiv.org

Large-scale diffusion-based generative models have led to breakthroughs in
text-conditioned high-resolution image synthesis. Starting from random noise, such text-to-image …

Cited by 517 Related articles All 2 versions

Consistency models

Y Song, P Dhariwal, M Chen, I Sutskever - arXiv preprint arXiv:2303.01469, 2023 - arxiv.org

Diffusion models have significantly advanced the fields of image, audio, and video
generation, but they depend on an iterative sampling process that causes slow generation …

Cited by 436 Related articles All 9 versions