PhotoMaker: Customizing Realistic Human Photos via Stacked ID Embedding
Recent advances in text-to-image generation have made remarkable progress in
synthesizing realistic human photos conditioned on given text prompts. However, existing …
Motion-I2V: Consistent and Controllable Image-to-Video Generation with Explicit Motion Modeling
We introduce Motion-I2V, a novel framework for consistent and controllable text-guided
image-to-video generation (I2V). In contrast to previous methods that directly learn the …
Direct-a-Video: Customized Video Generation with User-Directed Camera Movement and Object Motion
Recent text-to-video diffusion models have achieved impressive progress. In practice, users
often desire the ability to control object motion and camera movement independently for …
TC4D: Trajectory-Conditioned Text-to-4D Generation
Recent techniques for text-to-4D generation synthesize dynamic 3D scenes using
supervision from pre-trained text-to-video models. However, existing representations for …
CameraCtrl: Enabling Camera Control for Text-to-Video Generation
Controllability plays a crucial role in video generation since it allows users to create desired
content. However, existing models have largely overlooked precise control of camera pose …
CAT3D: Create Anything in 3D with Multi-View Diffusion Models
Advances in 3D reconstruction have enabled high-quality 3D capture, but they require a user to
collect hundreds to thousands of images to create a 3D scene. We present CAT3D, a …
Boximator: Generating Rich and Controllable Motions for Video Synthesis
Generating rich and controllable motion is a pivotal challenge in video synthesis. We
propose Boximator, a new approach for fine-grained motion control. Boximator introduces …
DragAPart: Learning a Part-Level Motion Prior for Articulated Objects
We introduce DragAPart, a method that, given an image and a set of drags as input, can
generate a new image of the same object in a new state, compatible with the action of the …
Vista: A Generalizable Driving World Model with High Fidelity and Versatile Controllability
World models can foresee the outcomes of different actions, which is of paramount
importance for autonomous driving. Nevertheless, existing driving world models still have …
Image Conductor: Precision Control for Interactive Video Synthesis
Filmmaking and animation production often require sophisticated techniques for
coordinating camera transitions and object movements, typically involving labor-intensive …