TartanAir: A dataset to push the limits of visual SLAM

Visual SLAM: What are the current trends and what to expect?

A Tourani, H Bavle, JL Sanchez-Lopez, H Voos - Sensors, 2022 - mdpi.com

In recent years, Simultaneous Localization and Mapping (SLAM) systems have shown
significant performance, accuracy, and efficiency gain. In this regard, Visual Simultaneous …

Cited by 58 · Related articles · All 14 versions

Depth Anything: Unleashing the power of large-scale unlabeled data

L Yang, B Kang, Z Huang, X Xu… - Proceedings of the …, 2024 - openaccess.thecvf.com

This work presents Depth Anything, a highly practical solution for robust monocular
depth estimation. Without pursuing novel technical modules, we aim to build a simple yet …

Cited by 201 · Related articles · All 6 versions

PointOdyssey: A large-scale synthetic dataset for long-term point tracking

Y Zheng, AW Harley, B Shen… - Proceedings of the …, 2023 - openaccess.thecvf.com

We introduce PointOdyssey, a large-scale synthetic dataset, and data generation framework,
for the training and evaluation of long-term fine-grained tracking algorithms. Our goal is to …

Cited by 59 · Related articles · All 5 versions

NeRF-SLAM: Real-time dense monocular SLAM with neural radiance fields

A Rosinol, JJ Leonard, L Carlone - 2023 IEEE/RSJ International …, 2023 - ieeexplore.ieee.org

We propose a novel geometric and photometric 3D mapping pipeline for accurate and
real-time scene reconstruction from casually taken monocular images. To achieve this, we …

Cited by 200 · Related articles · All 4 versions

DROID-SLAM: Deep visual SLAM for monocular, stereo, and RGB-D cameras

Z Teed, J Deng - Advances in neural information …, 2021 - proceedings.neurips.cc

We introduce DROID-SLAM, a new deep learning based SLAM system. DROID-SLAM
consists of recurrent iterative updates of camera pose and pixelwise depth through a Dense …

Cited by 399 · Related articles · All 7 versions

Unifying flow, stereo and depth estimation

H Xu, J Zhang, J Cai, H Rezatofighi… - … on Pattern Analysis …, 2023 - ieeexplore.ieee.org

We present a unified formulation and model for three motion and 3D perception tasks:
optical flow, rectified stereo matching and unrectified stereo depth estimation from posed …

Cited by 127 · Related articles · All 15 versions

Vision transformers for dense prediction

R Ranftl, A Bochkovskiy… - Proceedings of the IEEE …, 2021 - openaccess.thecvf.com

We introduce dense prediction transformers, an architecture that leverages vision
transformers in place of convolutional networks as a backbone for dense prediction tasks …

Cited by 1631 · Related articles · All 9 versions

Review of visual simultaneous localization and mapping based on deep learning

Y Zhang, Y Wu, K Tong, H Chen, Y Yuan - Remote Sensing, 2023 - mdpi.com

Due to the limitations of LiDAR, such as its high cost, short service life and massive volume,
visual sensors with their lightweight and low cost are attracting more and more attention and …

Cited by 14 · Related articles · All 4 versions

Battle of the backbones: A large-scale comparison of pretrained models across computer vision tasks

M Goldblum, H Souri, R Ni, M Shu… - Advances in …, 2024 - proceedings.neurips.cc

Neural network based computer vision systems are typically built on a backbone, a
pretrained or randomly initialized feature extractor. Several years ago, the default option was …

Cited by 31 · Related articles · All 5 versions

Towards zero-shot scale-aware monocular depth estimation

V Guizilini, I Vasiljevic, D Chen… - Proceedings of the …, 2023 - openaccess.thecvf.com

Monocular depth estimation is scale-ambiguous, and thus requires scale supervision to
produce metric predictions. Even so, the resulting models will be geometry-specific, with …

Cited by 33 · Related articles · All 5 versions