PhotoMaker: Customizing realistic human photos via stacked ID embedding

Z Li, M Cao, X Wang, Z Qi… - Proceedings of the …, 2024 - openaccess.thecvf.com
Recent advances in text-to-image generation have made remarkable progress in
synthesizing realistic human photos conditioned on given text prompts. However, existing …

Motion-I2V: Consistent and controllable image-to-video generation with explicit motion modeling

X Shi, Z Huang, FY Wang, W Bian, D Li… - ACM SIGGRAPH 2024 …, 2024 - dl.acm.org
We introduce Motion-I2V, a novel framework for consistent and controllable text-guided
image-to-video generation (I2V). In contrast to previous methods that directly learn the …

Direct-a-Video: Customized video generation with user-directed camera movement and object motion

S Yang, L Hou, H Huang, C Ma, P Wan… - ACM SIGGRAPH 2024 …, 2024 - dl.acm.org
Recent text-to-video diffusion models have achieved impressive progress. In practice, users
often desire the ability to control object motion and camera movement independently for …

TC4D: Trajectory-conditioned text-to-4D generation

S Bahmani, X Liu, Y Wang, I Skorokhodov… - arXiv preprint arXiv …, 2024 - arxiv.org
Recent techniques for text-to-4D generation synthesize dynamic 3D scenes using
supervision from pre-trained text-to-video models. However, existing representations for …

CameraCtrl: Enabling camera control for text-to-video generation

H He, Y Xu, Y Guo, G Wetzstein, B Dai, H Li… - arXiv preprint arXiv …, 2024 - arxiv.org
Controllability plays a crucial role in video generation since it allows users to create desired
content. However, existing models largely overlooked the precise control of camera pose …

CAT3D: Create anything in 3D with multi-view diffusion models

R Gao, A Holynski, P Henzler, A Brussee… - arXiv preprint arXiv …, 2024 - arxiv.org
Advances in 3D reconstruction have enabled high-quality 3D capture, but require a user to
collect hundreds to thousands of images to create a 3D scene. We present CAT3D, a …

Boximator: Generating rich and controllable motions for video synthesis

J Wang, Y Zhang, J Zou, Y Zeng, G Wei… - arXiv preprint arXiv …, 2024 - arxiv.org
Generating rich and controllable motion is a pivotal challenge in video synthesis. We
propose Boximator, a new approach for fine-grained motion control. Boximator introduces …

DragAPart: Learning a part-level motion prior for articulated objects

R Li, C Zheng, C Rupprecht, A Vedaldi - arXiv preprint arXiv:2403.15382, 2024 - arxiv.org
We introduce DragAPart, a method that, given an image and a set of drags as input, can
generate a new image of the same object in a new state, compatible with the action of the …

Vista: A Generalizable Driving World Model with High Fidelity and Versatile Controllability

S Gao, J Yang, L Chen, K Chitta, Y Qiu… - arXiv preprint arXiv …, 2024 - arxiv.org
World models can foresee the outcomes of different actions, which is of paramount
importance for autonomous driving. Nevertheless, existing driving world models still have …

Image Conductor: Precision Control for Interactive Video Synthesis

Y Li, X Wang, Z Zhang, Z Wang, Z Yuan, L Xie… - arXiv preprint arXiv …, 2024 - arxiv.org
Filmmaking and animation production often require sophisticated techniques for
coordinating camera transitions and object movements, typically involving labor-intensive …