TC4D: Trajectory-conditioned text-to-4D generation

S Bahmani, X Liu, W Yifan, I Skorokhodov… - … on Computer Vision, 2025 - Springer
Recent techniques for text-to-4D generation synthesize dynamic 3D scenes using
supervision from pre-trained text-to-video models. However, existing representations, such …

VD3D: Taming large video diffusion transformers for 3D camera control

S Bahmani, I Skorokhodov, A Siarohin… - arXiv preprint arXiv …, 2024 - arxiv.org
Modern text-to-video synthesis models demonstrate coherent, photorealistic generation of
complex videos from a text description. However, most existing models lack fine-grained …

AnimatableDreamer: Text-guided non-rigid 3D model generation and reconstruction with canonical score distillation

X Wang, Y Wang, J Ye, F Sun, Z Wang, L Wang… - … on Computer Vision, 2025 - Springer
Advances in 3D generation have facilitated sequential 3D model generation (aka 4D
generation), yet its application for animatable objects with large motion remains scarce. Our …

Alignment of diffusion models: Fundamentals, challenges, and future

B Liu, S Shao, B Li, L Bai, Z Xu, H Xiong, J Kwok… - arXiv preprint arXiv …, 2024 - arxiv.org
Diffusion models have emerged as the leading paradigm in generative modeling, excelling
in various applications. Despite their success, these models often misalign with human …

VividDreamer: Invariant score distillation for hyper-realistic text-to-3D generation

W Zhuo, F Ma, H Fan, Y Yang - European Conference on Computer Vision, 2025 - Springer
This paper presents Invariant Score Distillation (ISD), a novel method for
high-fidelity text-to-3D generation. ISD aims to tackle the over-saturation and over-smoothing …

DimensionX: Create any 3D and 4D scenes from a single image with controllable video diffusion

W Sun, S Chen, F Liu, Z Chen, Y Duan, J Zhang… - arXiv preprint arXiv …, 2024 - arxiv.org
In this paper, we introduce DimensionX, a framework designed to generate
photorealistic 3D and 4D scenes from just a single image with video diffusion. Our approach …

fMRI-3D: A Comprehensive Dataset for Enhancing fMRI-based 3D Reconstruction

J Gao, Y Fu, Y Wang, X Qian, J Feng, Y Fu - arXiv preprint arXiv …, 2024 - arxiv.org
Reconstructing 3D visuals from functional Magnetic Resonance Imaging (fMRI) data,
introduced as Recon3DMind in our conference work, is of significant interest to both …

GradualReality: Enhancing Physical Object Interaction in Virtual Reality via Interaction State-Aware Blending

HA Seo, J Yi, R Balan, Y Lee - Proceedings of the 37th Annual ACM …, 2024 - dl.acm.org
We present GradualReality, a novel interface enabling a Cross Reality experience that
includes gradual interaction with physical objects in a virtual environment and supports both …

Baking Gaussian Splatting into Diffusion Denoiser for Fast and Scalable Single-stage Image-to-3D Generation

Y Cai, H Zhang, K Zhang, Y Liang, M Ren… - arXiv preprint arXiv …, 2024 - arxiv.org
Existing feed-forward image-to-3D methods mainly rely on 2D multi-view diffusion models
that cannot guarantee 3D consistency. These methods easily collapse when changing the …

AC3D: Analyzing and Improving 3D Camera Control in Video Diffusion Transformers

S Bahmani, I Skorokhodov, G Qian, A Siarohin… - arXiv preprint arXiv …, 2024 - arxiv.org
Numerous works have recently integrated 3D camera control into foundational text-to-video
models, but the resulting camera control is often imprecise, and video generation quality …