Motionbooth: Motion-aware customized text-to-video generation - Academic Resource Search

11 results (0.01 seconds)

Motionbooth: Motion-aware customized text-to-video generation

[PDF] openreview.net

Non-uniform timestep sampling: Towards faster diffusion model training

T Zheng, C Geng, PT Jiang, B Wan, H Zhang… - Proceedings of the …, 2024 - dl.acm.org

Diffusion models have garnered significant success in generative tasks, emerging as the
predominant model in this domain. Despite their success, the substantial computational …

Cited by 5 · Related articles · All 3 versions

[PDF] arxiv.org

Multi-modal generative AI: Multi-modal LLM, diffusion and beyond

H Chen, X Wang, Y Zhou, B Huang, Y Zhang… - arXiv preprint arXiv …, 2024 - arxiv.org

Multi-modal generative AI has received increasing attention in both academia and industry.
Particularly, two dominant families of techniques are: i) The multi-modal large language …

Cited by 3 · Related articles · All 2 versions

[PDF] arxiv.org

Identity-Preserving Text-to-Video Generation by Frequency Decomposition

S Yuan, J Huang, X He, Y Ge, Y Shi, L Chen… - arXiv preprint arXiv …, 2024 - arxiv.org

Identity-preserving text-to-video (IPT2V) generation aims to create high-fidelity videos with
consistent human identity. It is an important task in video generation but remains an open …

Cited by 1 · Related articles · All 2 versions

[PDF] arxiv.org

CamI2V: Camera-Controlled Image-to-Video Diffusion Model

G Zheng, T Li, R Jiang, Y Lu, T Wu, X Li - arXiv preprint arXiv:2410.15957, 2024 - arxiv.org

Recently, camera pose, as a user-friendly and physics-related condition, has been
introduced into text-to-video diffusion model for camera control. However, existing methods …

Cited by 1 · Related articles · All 2 versions

[PDF] arxiv.org

Motion Prompting: Controlling Video Generation with Motion Trajectories

D Geng, C Herrmann, J Hur, F Cole, S Zhang… - arXiv preprint arXiv …, 2024 - arxiv.org

Motion control is crucial for generating expressive and compelling video content; however,
most existing video generation models rely mainly on text prompts for control, which struggle …

Related articles · All 3 versions

[PDF] arxiv.org

Trajectory Attention for Fine-grained Video Motion Control

Z Xiao, W Ouyang, Y Zhou, S Yang, L Yang, J Si… - arXiv preprint arXiv …, 2024 - arxiv.org

Recent advancements in video generation have been greatly driven by video diffusion
models, with camera motion control emerging as a crucial challenge in creating view …

Related articles · All 2 versions

[PDF] arxiv.org

MIMAFace: Face Animation via Motion-Identity Modulated Appearance Feature Learning

Y Han, J Zhu, Y Feng, X Ji, K He, X Li, Y Liu - arXiv preprint arXiv …, 2024 - arxiv.org

Current diffusion-based face animation methods generally adopt a ReferenceNet (a copy of
U-Net) and a large amount of curated self-acquired data to learn appearance features, as …

Related articles · All 3 versions

[PDF] arxiv.org

DiffSensei: Bridging Multi-Modal LLMs and Diffusion Models for Customized Manga Generation

J Wu, C Tang, J Wang, Y Zeng, X Li, Y Tong - arXiv preprint arXiv …, 2024 - arxiv.org

Story visualization, the task of creating visual narratives from textual descriptions, has seen
progress with text-to-image generation models. However, these models often lack effective …

Related articles · All 2 versions

[PDF] arxiv.org

SnapGen-V: Generating a Five-Second Video within Five Seconds on a Mobile Device

Y Wu, Z Zhang, Y Li, Y Xu, A Kag, Y Sui… - arXiv preprint arXiv …, 2024 - arxiv.org

We have witnessed the unprecedented success of diffusion-based video generation over
the past year. Recently proposed models from the community have wielded the power to …

Related articles · All 2 versions

[PDF] arxiv.org

RelationBooth: Towards Relation-Aware Customized Object Generation

Q Shi, L Qi, J Wu, J Bai, J Wang, Y Tong, X Li… - arXiv preprint arXiv …, 2024 - arxiv.org

Customized image generation is crucial for delivering personalized content based on user-
provided image prompts, aligning large-scale text-to-image diffusion models with individual …

Related articles · All 3 versions