Shadows Don't Lie and Lines Can't Bend! Generative Models don't know Projective Geometry... for now
Generative models can produce impressively realistic images. This paper demonstrates that
generated images have geometric features different from those of real images. We build a …
Amodal ground truth and completion in the wild
This paper studies amodal image segmentation: predicting entire object segmentation
masks including both visible and invisible (occluded) parts. In previous work the amodal …
Generative models: What do they know? do they know things? let's find out!
Generative models excel at mimicking real scenes, suggesting they might inherently encode
important intrinsic scene properties. In this paper, we aim to explore the following key …
Lightit: Illumination modeling and control for diffusion models
We introduce LightIt, a method for explicit illumination control for image generation. Recent
generative methods lack lighting control, which is crucial to numerous artistic aspects of …
Faster diffusion: Rethinking the role of unet encoder in diffusion models
One of the key components within diffusion models is the UNet for noise prediction. While
several works have explored basic properties of the UNet decoder, its encoder largely …
Object pose estimation via the aggregation of diffusion features
T Wang, G Hu, H Wang - … of the IEEE/CVF Conference on …, 2024 - openaccess.thecvf.com
Estimating the pose of objects from images is a crucial task of 3D scene understanding and
recent approaches have shown promising results on very large benchmarks. However, these …
Amodal completion via progressive mixed context diffusion
Our brain can effortlessly recognize objects even when partially hidden from view. Seeing
the visible of the hidden is called amodal completion; however, this task remains a challenge …
Lexicon3d: Probing visual foundation models for complex 3d scene understanding
Complex 3D scene understanding has gained increasing attention, with scene encoding
strategies playing a crucial role in this success. However, the optimal scene encoding …
Can Visual Foundation Models Achieve Long-term Point Tracking?
Large-scale vision foundation models have demonstrated remarkable success across
various tasks, underscoring their robust generalization capabilities. While their proficiency in …
Viewpoint Textual Inversion: Discovering Scene Representations and 3D View Control in 2D Diffusion Models
Text-to-image diffusion models generate impressive and realistic images, but do they learn
to represent the 3D world from only 2D supervision? We demonstrate that yes, certain 3D …