Visual SLAM: What are the current trends and what to expect?

A Tourani, H Bavle, JL Sanchez-Lopez, H Voos - Sensors, 2022 - mdpi.com
In recent years, Simultaneous Localization and Mapping (SLAM) systems have shown
significant performance, accuracy, and efficiency gain. In this regard, Visual Simultaneous …

Depth Anything: Unleashing the power of large-scale unlabeled data

L Yang, B Kang, Z Huang, X Xu… - Proceedings of the …, 2024 - openaccess.thecvf.com
This work presents Depth Anything, a highly practical solution for robust monocular
depth estimation. Without pursuing novel technical modules, we aim to build a simple yet …

PointOdyssey: A large-scale synthetic dataset for long-term point tracking

Y Zheng, AW Harley, B Shen… - Proceedings of the …, 2023 - openaccess.thecvf.com
We introduce PointOdyssey, a large-scale synthetic dataset, and data generation framework,
for the training and evaluation of long-term fine-grained tracking algorithms. Our goal is to …

NeRF-SLAM: Real-time dense monocular SLAM with neural radiance fields

A Rosinol, JJ Leonard, L Carlone - 2023 IEEE/RSJ International …, 2023 - ieeexplore.ieee.org
We propose a novel geometric and photometric 3D mapping pipeline for accurate and
real-time scene reconstruction from casually taken monocular images. To achieve this, we …

DROID-SLAM: Deep visual SLAM for monocular, stereo, and RGB-D cameras

Z Teed, J Deng - Advances in neural information …, 2021 - proceedings.neurips.cc
We introduce DROID-SLAM, a new deep learning based SLAM system. DROID-SLAM
consists of recurrent iterative updates of camera pose and pixelwise depth through a Dense …

Unifying flow, stereo and depth estimation

H Xu, J Zhang, J Cai, H Rezatofighi… - … on Pattern Analysis …, 2023 - ieeexplore.ieee.org
We present a unified formulation and model for three motion and 3D perception tasks:
optical flow, rectified stereo matching and unrectified stereo depth estimation from posed …

Vision transformers for dense prediction

R Ranftl, A Bochkovskiy… - Proceedings of the IEEE …, 2021 - openaccess.thecvf.com
We introduce dense prediction transformers, an architecture that leverages vision
transformers in place of convolutional networks as a backbone for dense prediction tasks …

Review of visual simultaneous localization and mapping based on deep learning

Y Zhang, Y Wu, K Tong, H Chen, Y Yuan - Remote Sensing, 2023 - mdpi.com
Due to the limitations of LiDAR, such as its high cost, short service life and massive volume,
visual sensors with their lightweight and low cost are attracting more and more attention and …

Battle of the backbones: A large-scale comparison of pretrained models across computer vision tasks

M Goldblum, H Souri, R Ni, M Shu… - Advances in …, 2024 - proceedings.neurips.cc
Neural network based computer vision systems are typically built on a backbone, a
pretrained or randomly initialized feature extractor. Several years ago, the default option was …

Towards zero-shot scale-aware monocular depth estimation

V Guizilini, I Vasiljevic, D Chen… - Proceedings of the …, 2023 - openaccess.thecvf.com
Monocular depth estimation is scale-ambiguous, and thus requires scale supervision to
produce metric predictions. Even so, the resulting models will be geometry-specific, with …