TRAM: Global Trajectory and Motion of 3D Humans from in-the-wild Videos

Y Wang, Z Wang, L Liu, K Daniilidis - European Conference on Computer …, 2025 - Springer
We propose TRAM, a two-stage method to reconstruct a human's global trajectory and
motion from in-the-wild videos. TRAM robustifies SLAM to recover the camera motion in the …

SCube: Instant large-scale scene reconstruction using VoxSplats

X Ren, Y Lu, H Liang, Z Wu, H Ling, M Chen… - arXiv preprint arXiv …, 2024 - arxiv.org
We present SCube, a novel method for reconstructing large-scale 3D scenes (geometry,
appearance, and semantics) from a sparse set of posed images. Our method encodes …

CoSEC: A coaxial stereo event camera dataset for autonomous driving

S Peng, H Zhou, H Dong, Z Shi, H Liu, Y Duan… - arXiv preprint arXiv …, 2024 - arxiv.org
Conventional frame camera is the mainstream sensor of the autonomous driving scene
perception, while it is limited in adverse conditions, such as low light. Event camera with …

Towards robust monocular depth estimation in non-Lambertian surfaces

J Zhang, J Li, Y Huang, Y Wang, J Zheng… - arXiv preprint arXiv …, 2024 - arxiv.org
In the field of monocular depth estimation (MDE), many models with excellent zero-shot
performance in general scenes emerge recently. However, these methods often fail in …

FusionSense: Bridging Common Sense, Vision, and Touch for Robust Sparse-View Reconstruction

I Fang, K Shi, X He, S Tan, Y Wang, H Zhao… - arXiv preprint arXiv …, 2024 - arxiv.org
Humans effortlessly integrate common-sense knowledge with sensory input from vision and
touch to understand their surroundings. Emulating this capability, we introduce …

Towards In-context Environment Sensing for Mobile Augmented Reality

Y Zhao, A Ganj, T Guo - Proceedings of the 30th Annual International …, 2024 - dl.acm.org
Environment sensing is a fundamental task in mobile augmented reality (AR). However,
on-device sensing and computing resources often limit mobile AR sensing capability, making …

Next Best Sense: Guiding Vision and Touch with FisherRF for 3D Gaussian Splatting

M Strong, B Lei, A Swann, W Jiang, K Daniilidis… - arXiv preprint arXiv …, 2024 - arxiv.org
We propose a framework for active next best view and touch selection for robotic
manipulators using 3D Gaussian Splatting (3DGS). 3DGS is emerging as a useful explicit …

Reactive Collision Avoidance for Safe Agile Navigation

A Saviolo, N Picello, R Verma, G Loianno - arXiv preprint arXiv …, 2024 - arxiv.org
Reactive collision avoidance is essential for agile robots navigating complex and dynamic
environments, enabling real-time obstacle response. However, this task is inherently …

Foundation Models Meet Low-Cost Sensors: Test-Time Adaptation for Rescaling Disparity for Zero-Shot Metric Depth Estimation

R Marsal, A Chapoutot, P Xu, D Filliat - arXiv preprint arXiv:2412.14103, 2024 - arxiv.org
The recent development of foundation models for monocular depth estimation such as
Depth Anything paved the way to zero-shot monocular depth estimation. Since it returns an …

Map-Free Visual Relocalization Enhanced by Instance Knowledge and Depth Knowledge

M Xiao, R Chen, H Luo, F Zhao, J Wang… - arXiv preprint arXiv …, 2024 - arxiv.org
Map-free relocalization technology is crucial for applications in autonomous navigation and
augmented reality, but relying on pre-built maps is often impractical. It faces significant …