A survey on video diffusion models

Z Xing, Q Feng, H Chen, Q Dai, H Hu, H Xu… - ACM Computing …, 2023 - dl.acm.org
The recent wave of AI-generated content (AIGC) has witnessed substantial success in
computer vision, with the diffusion model playing a crucial role in this achievement. Due to …

Sora: A review on background, technology, limitations, and opportunities of large vision models

Y Liu, K Zhang, Y Li, Z Yan, C Gao, R Chen… - arXiv preprint arXiv …, 2024 - arxiv.org
Sora is a text-to-video generative AI model, released by OpenAI in February 2024. The
model is trained to generate videos of realistic or imaginative scenes from text instructions …

DynamiCrafter: Animating open-domain images with video diffusion priors

J Xing, M Xia, Y Zhang, H Chen, W Yu, H Liu… - … on Computer Vision, 2025 - Springer
Animating a still image offers an engaging visual experience. Traditional image animation
techniques mainly focus on animating natural scenes with stochastic dynamics (e.g., clouds …

State of the art on diffusion models for visual computing

R Po, W Yifan, V Golyanik, K Aberman… - Computer Graphics …, 2024 - Wiley Online Library
The field of visual computing is rapidly advancing due to the emergence of generative
artificial intelligence (AI), which unlocks unprecedented capabilities for the generation …

LivePhoto: Real image animation with text-guided motion control

X Chen, Z Liu, M Chen, Y Feng, Y Liu, Y Shen… - … on Computer Vision, 2025 - Springer
Despite the recent progress in text-to-video generation, existing studies usually overlook the
issue that only spatial contents but not temporal motions in synthesized videos are under the …

PIA: Your personalized image animator via plug-and-play modules in text-to-image models

Y Zhang, Z Xing, Y Zeng, Y Fang… - Proceedings of the …, 2024 - openaccess.thecvf.com
Recent advancements in personalized text-to-image (T2I) models have revolutionized
content creation, empowering non-experts to generate stunning images with unique styles …

Motion-I2V: Consistent and controllable image-to-video generation with explicit motion modeling

X Shi, Z Huang, FY Wang, W Bian, D Li… - ACM SIGGRAPH 2024 …, 2024 - dl.acm.org
We introduce Motion-I2V, a novel framework for consistent and controllable text-guided
image-to-video generation (I2V). In contrast to previous methods that directly learn the …

PhysDreamer: Physics-based interaction with 3D objects via video generation

T Zhang, HX Yu, R Wu, BY Feng, C Zheng… - … on Computer Vision, 2025 - Springer
Realistic object interactions are crucial for creating immersive virtual experiences, yet
synthesizing realistic 3D object dynamics in response to novel interactions remains a …

EditableNeRF: Editing topologically varying neural radiance fields by key points

C Zheng, W Lin, F Xu - … of the IEEE/CVF Conference on …, 2023 - openaccess.thecvf.com
Neural radiance fields (NeRF) achieve highly photo-realistic novel-view synthesis, but editing
the scenes modeled by NeRF-based methods remains a challenging problem, especially for …

DragAPart: Learning a part-level motion prior for articulated objects

R Li, C Zheng, C Rupprecht, A Vedaldi - European Conference on …, 2025 - Springer
We introduce DragAPart, a method that, given an image and a set of drags as input,
generates a new image of the same object that responds to the action of the drags …