ControlNet++: Improving Conditional Controls with Efficient Consistency Feedback
To enhance the controllability of text-to-image diffusion models, existing efforts like
ControlNet incorporated image-based conditional controls. In this paper, we reveal that …
DragAPart: Learning a part-level motion prior for articulated objects
We introduce DragAPart, a method that, given an image and a set of drags as input,
generates a new image of the same object that responds to the action of the drags …
Diffusion models for monocular depth estimation: Overcoming challenging conditions
We present a novel approach designed to address the complexities posed by challenging,
out-of-distribution data in the single-image depth estimation task. Starting with images that …
SmartControl: Enhancing ControlNet for handling rough visual conditions
Recent text-to-image generation methods such as ControlNet have achieved remarkable
success in controlling image layouts, where the generated images by the default model are …
Controllable generation with text-to-image diffusion models: A survey
In the rapidly advancing realm of visual generation, diffusion models have revolutionized the
landscape, marking a significant shift in capabilities with their impressive text-guided …
Multi-modal generative AI: Multi-modal LLM, diffusion and beyond
Multi-modal generative AI has received increasing attention in both academia and industry.
Particularly, two dominant families of techniques are: i) The multi-modal large language …
AnyControl: Create your artwork with versatile control on text-to-image generation
The field of text-to-image (T2I) generation has made significant progress in recent years,
largely driven by advancements in diffusion models. Linguistic control enables effective …
When ControlNet Meets Inexplicit Masks: A Case Study of ControlNet on its Contour-following Ability
ControlNet excels at creating content that closely matches precise contours in user-provided
masks. However, when these masks contain noise, a frequent occurrence with non …
BootPIG: Bootstrapping zero-shot personalized image generation capabilities in pretrained diffusion models
Recent text-to-image generation models have demonstrated incredible success in
generating images that faithfully follow input prompts. However, the requirement of using …
A Survey of Multimodal-Guided Image Editing with Text-to-Image Diffusion Models
Image editing aims to edit the given synthetic or real image to meet the specific requirements
from users. It has been widely studied in recent years as a promising and challenging field of …