State of the art on diffusion models for visual computing

R Po, W Yifan, V Golyanik, K Aberman… - Computer Graphics …, 2024 - Wiley Online Library
The field of visual computing is rapidly advancing due to the emergence of generative
artificial intelligence (AI), which unlocks unprecedented capabilities for the generation …

Align your Gaussians: Text-to-4D with dynamic 3D Gaussians and composed diffusion models

H Ling, SW Kim, A Torralba… - Proceedings of the …, 2024 - openaccess.thecvf.com
Text-guided diffusion models have revolutionized image and video generation and have
also been successfully used for optimization-based 3D object synthesis. Here we instead …

Triplane meets Gaussian splatting: Fast and generalizable single-view 3D reconstruction with transformers

ZX Zou, Z Yu, YC Guo, Y Li, D Liang… - Proceedings of the …, 2024 - openaccess.thecvf.com
Recent advancements in 3D reconstruction from single images have been driven by the
evolution of generative models. Prominent among these are methods based on Score …

Splatter Image: Ultra-fast single-view 3D reconstruction

S Szymanowicz, C Rupprecht… - Proceedings of the …, 2024 - openaccess.thecvf.com
We introduce the Splatter Image, an ultra-efficient approach for monocular 3D object
reconstruction. Splatter Image is based on Gaussian Splatting, which allows fast and high …

A comprehensive survey on 3D content generation

J Liu, X Huang, T Huang, L Chen, Y Hou… - arXiv preprint arXiv …, 2024 - arxiv.org
Recent years have witnessed remarkable advances in artificial intelligence generated
content (AIGC), with diverse input modalities, e.g., text, image, video, audio and 3D. The 3D is …

LGM: Large multi-view Gaussian model for high-resolution 3D content creation

J Tang, Z Chen, X Chen, T Wang, G Zeng… - arXiv preprint arXiv …, 2024 - arxiv.org
3D content creation has achieved significant progress in terms of both quality and speed.
Although current feed-forward models can produce 3D objects in seconds, their resolution is …

CRM: Single image to 3D textured mesh with convolutional reconstruction model

Z Wang, Y Wang, Y Chen, C Xiang, S Chen… - arXiv preprint arXiv …, 2024 - arxiv.org
Feed-forward 3D generative models like the Large Reconstruction Model (LRM) have
demonstrated exceptional generation speed. However, the transformer-based methods do …

4DGen: Grounded 4D content generation with spatial-temporal consistency

Y Yin, D Xu, Z Wang, Y Zhao, Y Wei - arXiv preprint arXiv:2312.17225, 2023 - arxiv.org
Aided by text-to-image and text-to-video diffusion models, existing 4D content creation
pipelines utilize score distillation sampling to optimize the entire dynamic 3D scene …

Learning the 3D Fauna of the Web

Z Li, D Litvak, R Li, Y Zhang, T Jakab… - Proceedings of the …, 2024 - openaccess.thecvf.com
Learning 3D models of all animals in nature requires massively scaling up existing
solutions. With this ultimate goal in mind, we develop 3D-Fauna, an approach that learns a …

GRM: Large Gaussian reconstruction model for efficient 3D reconstruction and generation

Y Xu, Z Shi, W Yifan, H Chen, C Yang, S Peng… - arXiv preprint arXiv …, 2024 - arxiv.org
We introduce GRM, a large-scale reconstructor capable of recovering a 3D asset from
sparse-view images in around 0.1 s. GRM is a feed-forward transformer-based model that …