Segment anything

A Kirillov, E Mintun, N Ravi, H Mao… - Proceedings of the …, 2023 - openaccess.thecvf.com
Abstract We introduce the Segment Anything (SA) project: a new task, model, and dataset for
image segmentation. Using our efficient model in a data collection loop, we built the largest …

Zero-1-to-3: Zero-shot one image to 3d object

R Liu, R Wu, B Van Hoorick… - Proceedings of the …, 2023 - openaccess.thecvf.com
Abstract We introduce Zero-1-to-3, a framework for changing the camera viewpoint of an
object given just a single RGB image. To perform novel view synthesis in this …

One-2-3-45: Any single image to 3d mesh in 45 seconds without per-shape optimization

M Liu, C Xu, H Jin, L Chen… - Advances in Neural …, 2024 - proceedings.neurips.cc
Single image 3D reconstruction is an important but challenging task that requires extensive
knowledge of our natural world. Many existing methods solve this problem by optimizing a …

Objaverse-xl: A universe of 10m+ 3d objects

M Deitke, R Liu, M Wallingford, H Ngo… - Advances in …, 2024 - proceedings.neurips.cc
Natural language processing and 2D vision models have attained remarkable proficiency on
many tasks primarily by escalating the scale of training data. However, 3D vision tasks have …

Mvdream: Multi-view diffusion for 3d generation

Y Shi, P Wang, J Ye, M Long, K Li, X Yang - arXiv preprint arXiv …, 2023 - arxiv.org
We propose MVDream, a multi-view diffusion model that is able to generate geometrically
consistent multi-view images from a given text prompt. By leveraging image diffusion models …

One-2-3-45++: Fast single image to 3d objects with consistent multi-view generation and 3d diffusion

M Liu, R Shi, L Chen, Z Zhang, C Xu… - Proceedings of the …, 2024 - openaccess.thecvf.com
Recent advancements in open-world 3D object generation have been remarkable with
image-to-3D methods offering superior fine-grained control over their text-to-3D …

Humans in 4D: Reconstructing and tracking humans with transformers

S Goel, G Pavlakos, J Rajasegaran… - Proceedings of the …, 2023 - openaccess.thecvf.com
We present an approach to reconstruct humans and track them over time. At the core of our
approach, we propose a fully" transformerized" version of a network for human mesh …

Triplane meets gaussian splatting: Fast and generalizable single-view 3d reconstruction with transformers

ZX Zou, Z Yu, YC Guo, Y Li, D Liang… - Proceedings of the …, 2024 - openaccess.thecvf.com
Recent advancements in 3D reconstruction from single images have been driven by the
evolution of generative models. Prominent among these are methods based on Score …

Lrm: Large reconstruction model for single image to 3d

Y Hong, K Zhang, J Gu, S Bi, Y Zhou, D Liu… - arXiv preprint arXiv …, 2023 - arxiv.org
We propose the first Large Reconstruction Model (LRM) that predicts the 3D model of an
object from a single input image within just 5 seconds. In contrast to many previous methods …

Let 2d diffusion model know 3d-consistency for robust text-to-3d generation

J Seo, W Jang, MS Kwak, H Kim, J Ko, J Kim… - arXiv preprint arXiv …, 2023 - arxiv.org
Text-to-3D generation has shown rapid progress in recent days with the advent of score
distillation, a methodology of using pretrained text-to-2D diffusion models to optimize neural …