Make-your-video: Customized video generation using textual and structural guidance

Z Xing, Q Feng, H Chen, Q Dai, H Hu, H Xu… - ACM Computing …, 2023 - dl.acm.org

The recent wave of AI-generated content (AIGC) has witnessed substantial success in
computer vision, with the diffusion model playing a crucial role in this achievement. Due to …

Cited by 50 · Related articles · All 3 versions

[PDF] arxiv.org

Sora: A review on background, technology, limitations, and opportunities of large vision models

Y Liu, K Zhang, Y Li, Z Yan, C Gao, R Chen… - arXiv preprint arXiv …, 2024 - arxiv.org

Sora is a text-to-video generative AI model, released by OpenAI in February 2024. The
model is trained to generate videos of realistic or imaginative scenes from text instructions …

Cited by 109 · Related articles · All 2 versions

[PDF] thecvf.com

Vbench: Comprehensive benchmark suite for video generative models

Z Huang, Y He, J Yu, F Zhang, C Si… - Proceedings of the …, 2024 - openaccess.thecvf.com

Video generation has witnessed significant advancements, yet evaluating these models
remains a challenge. A comprehensive evaluation benchmark for video generation is …

Cited by 68 · Related articles · All 4 versions

[PDF] arxiv.org

I2vgen-xl: High-quality image-to-video synthesis via cascaded diffusion models

S Zhang, J Wang, Y Zhang, K Zhao, H Yuan… - arXiv preprint arXiv …, 2023 - arxiv.org

Video synthesis has recently made remarkable strides benefiting from the rapid
development of diffusion models. However, it still encounters challenges in terms of …

Cited by 83 · Related articles · All 2 versions

[PDF] arxiv.org

Dynamicrafter: Animating open-domain images with video diffusion priors

J Xing, M Xia, Y Zhang, H Chen, X Wang… - arXiv preprint arXiv …, 2023 - arxiv.org

Animating a still image offers an engaging visual experience. Traditional image animation
techniques mainly focus on animating natural scenes with stochastic dynamics (e.g., clouds …

Cited by 63 · Related articles · All 2 versions

[PDF] thecvf.com

A recipe for scaling up text-to-video generation with text-free videos

X Wang, S Zhang, H Yuan, Z Qing… - Proceedings of the …, 2024 - openaccess.thecvf.com

Diffusion-based text-to-video generation has witnessed impressive progress in the past year
yet still falls behind text-to-image generation. One of the key reasons is the limited scale of …

Cited by 8 · Related articles · All 4 versions

[PDF] openreview.net

Scalecrafter: Tuning-free higher-resolution visual generation with diffusion models

Y He, S Yang, H Chen, X Cun, M Xia… - The Twelfth …, 2023 - openreview.net

In this work, we investigate the capability of generating images from pre-trained diffusion
models at much higher resolutions than the training image sizes. In addition, the generated …

Cited by 24 · Related articles · All 3 versions

[PDF] thecvf.com

ART-V: Auto-Regressive Text-to-Video Generation with Diffusion Models

W Weng, R Feng, Y Wang, Q Dai… - Proceedings of the …, 2024 - openaccess.thecvf.com

We present ART-V, an efficient framework for auto-regressive video generation with diffusion
models. Unlike existing methods that generate entire videos in one-shot, ART-V generates a …

Cited by 7 · Related articles · All 4 versions

[PDF] arxiv.org

Animate-a-story: Storytelling with retrieval-augmented video generation

Y He, M Xia, H Chen, X Cun, Y Gong, J Xing… - arXiv preprint arXiv …, 2023 - arxiv.org

Generating videos for visual storytelling can be a tedious and complex process that typically
requires either live-action filming or graphics animation rendering. To bypass these …

Cited by 36 · Related articles · All 2 versions

[PDF] thecvf.com

InstructVideo: instructing video diffusion models with human feedback

H Yuan, S Zhang, X Wang, Y Wei… - Proceedings of the …, 2024 - openaccess.thecvf.com

Diffusion models have emerged as the de facto paradigm for video generation. However,
their reliance on web-scale data of varied quality often yields results that are visually …

Cited by 10 · Related articles · All 4 versions