TRAM: Global Trajectory and Motion of 3D Humans from in-the-wild Videos
We propose TRAM, a two-stage method to reconstruct a human's global trajectory and
motion from in-the-wild videos. TRAM robustifies SLAM to recover the camera motion in the …
SCube: Instant large-scale scene reconstruction using voxsplats
We present SCube, a novel method for reconstructing large-scale 3D scenes (geometry,
appearance, and semantics) from a sparse set of posed images. Our method encodes …
CoSEC: A coaxial stereo event camera dataset for autonomous driving
Conventional frame camera is the mainstream sensor of the autonomous driving scene
perception, while it is limited in adverse conditions, such as low light. Event camera with …
Towards robust monocular depth estimation in non-Lambertian surfaces
In the field of monocular depth estimation (MDE), many models with excellent zero-shot
performance in general scenes emerge recently. However, these methods often fail in …
FusionSense: Bridging Common Sense, Vision, and Touch for Robust Sparse-View Reconstruction
Humans effortlessly integrate common-sense knowledge with sensory input from vision and
touch to understand their surroundings. Emulating this capability, we introduce …
Towards In-context Environment Sensing for Mobile Augmented Reality
Environment sensing is a fundamental task in mobile augmented reality (AR). However, on-
device sensing and computing resources often limit mobile AR sensing capability, making …
Next Best Sense: Guiding Vision and Touch with FisherRF for 3D Gaussian Splatting
We propose a framework for active next best view and touch selection for robotic
manipulators using 3D Gaussian Splatting (3DGS). 3DGS is emerging as a useful explicit …
Reactive Collision Avoidance for Safe Agile Navigation
Reactive collision avoidance is essential for agile robots navigating complex and dynamic
environments, enabling real-time obstacle response. However, this task is inherently …
Foundation Models Meet Low-Cost Sensors: Test-Time Adaptation for Rescaling Disparity for Zero-Shot Metric Depth Estimation
The recent development of foundation models for monocular depth estimation such as
Depth Anything paved the way to zero-shot monocular depth estimation. Since it returns an …
Map-Free Visual Relocalization Enhanced by Instance Knowledge and Depth Knowledge
M Xiao, R Chen, H Luo, F Zhao, J Wang… - arXiv preprint arXiv …, 2024 - arxiv.org
Map-free relocalization technology is crucial for applications in autonomous navigation and
augmented reality, but relying on pre-built maps is often impractical. It faces significant …