VD3D: Taming large video diffusion transformers for 3D camera control

S Bahmani, I Skorokhodov, A Siarohin… - arXiv preprint arXiv …, 2024 - arxiv.org
Modern text-to-video synthesis models demonstrate coherent, photorealistic generation of
complex videos from a text description. However, most existing models lack fine-grained …

Splatt3R: Zero-shot Gaussian splatting from uncalibrated image pairs

B Smart, C Zheng, I Laina, VA Prisacariu - arXiv preprint arXiv:2408.13912, 2024 - arxiv.org
In this paper, we introduce Splatt3R, a pose-free, feed-forward method for in-the-wild 3D
reconstruction and novel view synthesis from stereo pairs. Given uncalibrated natural …

ReconX: Reconstruct any scene from sparse views with video diffusion model

F Liu, W Sun, H Wang, Y Wang, H Sun, J Ye… - arXiv preprint arXiv …, 2024 - arxiv.org
Advancements in 3D scene reconstruction have transformed 2D images from the real world
into 3D models, producing realistic 3D results from hundreds of input photos. Despite great …

MVSplat360: Feed-forward 360° scene synthesis from sparse views

Y Chen, C Zheng, H Xu, B Zhuang, A Vedaldi… - arXiv preprint arXiv …, 2024 - arxiv.org
We introduce MVSplat360, a feed-forward approach for 360° novel view synthesis
(NVS) of diverse real-world scenes, using only sparse observations. This setting is …

MultiDepth: Multi-Sample Priors for Refining Monocular Metric Depth Estimations in Indoor Scenes

S Byun, J Song, WS Chung - arXiv preprint arXiv:2411.01048, 2024 - arxiv.org
Monocular metric depth estimation (MMDE) is a crucial task to solve for indoor scene
reconstruction on edge devices. Despite this importance, existing models are sensitive to …

GaussTR: Foundation Model-Aligned Gaussian Transformer for Self-Supervised 3D Spatial Understanding

H Jiang, L Liu, T Cheng, X Wang, T Lin, Z Su… - arXiv preprint arXiv …, 2024 - arxiv.org
3D Semantic Occupancy Prediction is fundamental for spatial understanding as it provides a
comprehensive semantic cognition of surrounding environments. However, prevalent …

Novel View Synthesis with Pixel-Space Diffusion Models

N Elata, B Kawar, Y Ostrovsky-Berman… - arXiv preprint arXiv …, 2024 - arxiv.org
Synthesizing a novel view from a single input image is a challenging task. Traditionally, this
task was approached by estimating scene depth, warping, and inpainting, with machine …

AC3D: Analyzing and Improving 3D Camera Control in Video Diffusion Transformers

S Bahmani, I Skorokhodov, G Qian, A Siarohin… - arXiv preprint arXiv …, 2024 - arxiv.org
Numerous works have recently integrated 3D camera control into foundational text-to-video
models, but the resulting camera control is often imprecise, and video generation quality …

Generative Densification: Learning to Densify Gaussians for High-Fidelity Generalizable 3D Reconstruction

S Nam, X Sun, G Kang, Y Lee, S Oh, E Park - arXiv preprint arXiv …, 2024 - arxiv.org
Generalized feed-forward Gaussian models have achieved significant progress in sparse-
view 3D reconstruction by leveraging prior knowledge from large multi-view datasets …

A Lesson in Splats: Teacher-Guided Diffusion for 3D Gaussian Splats Generation with 2D Supervision

C Peng, I Sobol, M Tomizuka, K Keutzer, C Xu… - arXiv preprint arXiv …, 2024 - arxiv.org
We introduce a diffusion model for Gaussian Splats, SplatDiffusion, to enable generation of
three-dimensional structures from single images, addressing the ill-posed nature of lifting 2D …