Shadows Don't Lie and Lines Can't Bend! Generative Models don't know Projective Geometry... for now

A Sarkar, H Mai, A Mahapatra… - Proceedings of the …, 2024 - openaccess.thecvf.com
Generative models can produce impressively realistic images. This paper demonstrates that
generated images have geometric features different from those of real images. We build a …

StyleGAN knows normal, depth, albedo, and more

A Bhattad, D McKee, D Hoiem… - Advances in Neural …, 2024 - proceedings.neurips.cc
Intrinsic images, in the original sense, are image-like maps of scene properties like depth,
normal, albedo, or shading. This paper demonstrates that StyleGAN can easily be induced …

Image sculpting: Precise object editing with 3d geometry control

J Yenphraphai, X Pan, S Liu… - Proceedings of the …, 2024 - openaccess.thecvf.com
We present Image Sculpting, a new framework for editing 2D images by
incorporating tools from 3D geometry and graphics. This approach differs markedly from …

Diffusion Handles: Enabling 3D Edits for Diffusion Models by Lifting Activations to 3D

K Pandey, P Guerrero, M Gadelha… - Proceedings of the …, 2024 - openaccess.thecvf.com
Diffusion Handles is a novel approach to enable 3D object edits on diffusion images,
requiring only existing pre-trained diffusion models and depth estimation, without any fine-tuning …

DORSal: Diffusion for Object-centric Representations of Scenes

A Jabri, S van Steenkiste, E Hoogeboom… - arXiv preprint arXiv …, 2023 - arxiv.org
Recent progress in 3D scene understanding enables scalable learning of representations
across large datasets of diverse scenes. As a consequence, generalization to unseen …

ZeST: Zero-shot material transfer from a single image

TY Cheng, P Sharma, A Markham, N Trigoni… - … on Computer Vision, 2025 - Springer
We propose ZeST, a method for zero-shot material transfer to an object in the input image
given a material exemplar image. ZeST leverages existing diffusion adapters to extract …

Customizing Text-to-Image Diffusion with Camera Viewpoint Control

N Kumari, G Su, R Zhang, T Park, E Shechtman… - arXiv preprint arXiv …, 2024 - arxiv.org
Model customization introduces new concepts to existing text-to-image models, enabling the
generation of the new concept in novel contexts. However, such methods lack accurate …

Videoshop: Localized Semantic Video Editing with Noise-Extrapolated Diffusion Inversion

X Fan, A Bhattad, R Krishna - arXiv preprint arXiv:2403.14617, 2024 - arxiv.org
We introduce Videoshop, a training-free video editing algorithm for localized semantic edits.
Videoshop allows users to use any editing software, including Photoshop and generative …

GeoDiffuser: Geometry-Based Image Editing with Diffusion Models

R Sajnani, J Vanbaar, J Min, K Katyal… - arXiv preprint arXiv …, 2024 - arxiv.org
The success of image generative models has enabled us to build methods that can edit
images based on text or other user input. However, these methods are bespoke, imprecise …

See, Imagine, Plan: Discovering and Hallucinating Tasks from a Single Image

C Ma, K Lu, TY Cheng, N Trigoni… - arXiv preprint arXiv …, 2024 - arxiv.org
Humans can not only recognize and understand the world in its current state but also
envision future scenarios that extend beyond immediate perception. To resemble this …