LatentWarp: Consistent Diffusion Latents for Zero-Shot Video-to-Video Translation

Y Feng, S Gao, Y Bao, X Wang, S Han, J Zhang… - … on Computer Vision, 2025 - Springer

Text-driven video editing has emerged as a prominent application based on the
breakthroughs of image diffusion models. Existing state-of-the-art methods focus on zero …

Cited by: 2

LLMs Meet Multimodal Generation and Editing: A Survey

Y He, Z Liu, J Chen, Z Tian, H Liu, X Chi, R Liu… - arXiv preprint arXiv …, 2024 - arxiv.org

With the recent advancement in large language models (LLMs), there is a growing interest in
combining LLMs with multimodal learning. Previous surveys of multimodal large language …

Cited by: 12

MotionCharacter: Identity-Preserving and Motion Controllable Human Video Generation

H Fang, D Qiu, B Mao, P Yan, H Tang - arXiv preprint arXiv:2411.18281, 2024 - arxiv.org

Recent advancements in personalized Text-to-Video (T2V) generation highlight the
importance of integrating character-specific identities and actions. However, previous T2V …

MovieCharacter: A Tuning-Free Framework for Controllable Character Video Synthesis

D Qiu, Z Chen, R Wang, M Fan, C Yu, J Huan… - arXiv preprint arXiv …, 2024 - arxiv.org

Recent advancements in character video synthesis still depend on extensive fine-tuning or
complex 3D modeling processes, which can restrict accessibility and hinder real-time …

MotionCraft: Physics-based Zero-Shot Video Generation

LS Aira, A Montanaro, E Aiello, D Valsesia… - arXiv preprint arXiv …, 2024 - arxiv.org

Generating videos with realistic and physically plausible motion is one of the main recent
challenges in computer vision. While diffusion models are achieving compelling results in …

MotionCraft: Physics-Based Zero-Shot Video Generation

A Montanaro, LS Aira, E Aiello, D Valsesia… - The Thirty-eighth Annual … - openreview.net

Generating videos with realistic and physically plausible motion is one of the main recent
challenges in computer vision. While diffusion models are achieving compelling results in …