A survey on video diffusion models

Z Xing, Q Feng, H Chen, Q Dai, H Hu, H Xu… - ACM Computing …, 2023 - dl.acm.org
The recent wave of AI-generated content (AIGC) has witnessed substantial success in
computer vision, with the diffusion model playing a crucial role in this achievement. Due to …

Vbench: Comprehensive benchmark suite for video generative models

Z Huang, Y He, J Yu, F Zhang, C Si… - Proceedings of the …, 2024 - openaccess.thecvf.com
Video generation has witnessed significant advancements yet evaluating these models
remains a challenge. A comprehensive evaluation benchmark for video generation is …

Videocrafter2: Overcoming data limitations for high-quality video diffusion models

H Chen, Y Zhang, X Cun, M Xia… - Proceedings of the …, 2024 - openaccess.thecvf.com
Text-to-video generation aims to produce a video based on a given prompt. Recently
several commercial video models have been able to generate plausible videos with minimal …

I2vgen-xl: High-quality image-to-video synthesis via cascaded diffusion models

S Zhang, J Wang, Y Zhang, K Zhao, H Yuan… - arXiv preprint arXiv …, 2023 - arxiv.org
Video synthesis has recently made remarkable strides benefiting from the rapid
development of diffusion models. However, it still encounters challenges in terms of …

Sparsectrl: Adding sparse controls to text-to-video diffusion models

Y Guo, C Yang, A Rao, M Agrawala, D Lin… - arXiv preprint arXiv …, 2023 - arxiv.org
The development of text-to-video (T2V), ie, generating videos with a given text prompt, has
been significantly advanced in recent years. However, relying solely on text prompts often …

Motion-i2v: Consistent and controllable image-to-video generation with explicit motion modeling

X Shi, Z Huang, FY Wang, W Bian, D Li… - ACM SIGGRAPH 2024 …, 2024 - dl.acm.org
We introduce Motion-I2V, a novel framework for consistent and controllable text-guided
image-to-video generation (I2V). In contrast to previous methods that directly learn the …

Breathing Life Into Sketches Using Text-to-Video Priors

R Gal, Y Vinker, Y Alaluf, A Bermano… - Proceedings of the …, 2024 - openaccess.thecvf.com
A sketch is one of the most intuitive and versatile tools humans use to convey their ideas
visually. An animated sketch opens another dimension to the expression of ideas and is …

Streamingt2v: Consistent, dynamic, and extendable long video generation from text

R Henschel, L Khachatryan, D Hayrapetyan… - arXiv preprint arXiv …, 2024 - arxiv.org
Text-to-video diffusion models enable the generation of high-quality videos that follow text
instructions, making it easy to create diverse and individual content. However, existing …

Follow-your-click: Open-domain regional image animation via short prompts

Y Ma, Y He, H Wang, A Wang, C Qi, C Cai, X Li… - arXiv preprint arXiv …, 2024 - arxiv.org
Despite recent advances in image-to-video generation, better controllability and local
animation are less explored. Most existing image-to-video methods are not locally aware …

Anyv2v: A plug-and-play framework for any video-to-video editing tasks

M Ku, C Wei, W Ren, H Yang, W Chen - arXiv preprint arXiv:2403.14468, 2024 - arxiv.org
Video-to-video editing involves editing a source video along with additional control (such as
text prompts, subjects, or styles) to generate a new video that aligns with the source video …