DragAnything: Motion control for anything using entity representation
We introduce DragAnything, which utilizes an entity representation to achieve motion control
for any object in controllable video generation. Comparison to existing motion control …
Readout guidance: Learning control from diffusion features
Abstract: We present Readout Guidance, a method for controlling text-to-image diffusion
models with learned signals. Readout Guidance uses readout heads, lightweight networks …
MagicDrive: Street view generation with diverse 3D geometry control
Recent advancements in diffusion models have significantly enhanced the data synthesis
with 2D control. Yet, precise 3D control in street view generation, crucial for 3D perception …
Focus on your instruction: Fine-grained and multi-instruction image editing by attention modulation
Q Guo, T Lin - Proceedings of the IEEE/CVF Conference on …, 2024 - openaccess.thecvf.com
Recently, diffusion-based methods like InstructPix2Pix (IP2P) have achieved effective
instruction-based image editing, requiring only natural language instructions from the user …
DGInStyle: Domain-generalizable semantic segmentation with image diffusion models and stylized semantic control
Large, pretrained latent diffusion models (LDMs) have demonstrated an extraordinary ability
to generate creative content, specialize to user data through few-shot fine-tuning, and …
HOIDiffusion: Generating realistic 3D hand-object interaction data
Abstract: 3D hand-object interaction data is scarce due to the hardware constraints in scaling
up the data collection process. In this paper, we propose HOIDiffusion for generating realistic …
Generative models: What do they know? do they know things? let's find out!
Generative models excel at mimicking real scenes, suggesting they might inherently encode
important intrinsic scene properties. In this paper, we aim to explore the following key …
UniGS: Unified representation for image generation and segmentation
This paper introduces a novel unified representation of diffusion models for image
generation and segmentation. Specifically, we use a colormap to represent entity-level …
Deepfake: definitions, performance metrics and standards, datasets, and a meta-review
Recent advancements in AI, especially deep learning, have contributed to a significant
increase in the creation of new realistic-looking synthetic media (video, image, and audio) …
Multimodal self-instruct: Synthetic abstract image and visual reasoning instruction using language model
Although most current large multimodal models (LMMs) can already understand photos of
natural scenes and portraits, their understanding of abstract images, e.g., charts, maps, or …