Pseudo-LiDAR from visual depth estimation: Bridging the gap in 3D object detection for autonomous...

J Mao, S Shi, X Wang, H Li - International Journal of Computer Vision, 2023 - Springer

Autonomous driving, in recent years, has been receiving increasing attention for its potential
to relieve drivers' burdens and improve the safety of driving. In modern autonomous driving …

Cited by 91 · Related articles · All 8 versions

Delving into the devils of bird's-eye-view perception: A review, evaluation and recipe

H Li, C Sima, J Dai, W Wang, L Lu… - … on Pattern Analysis …, 2023 - ieeexplore.ieee.org

Learning powerful representations in bird's-eye-view (BEV) for perception tasks is trending
and drawing extensive attention both from industry and academia. Conventional …

Cited by 81 · Related articles · All 9 versions

Depth Anything: Unleashing the power of large-scale unlabeled data

L Yang, B Kang, Z Huang, X Xu… - Proceedings of the …, 2024 - openaccess.thecvf.com

This work presents Depth Anything, a highly practical solution for robust monocular
depth estimation. Without pursuing novel technical modules, we aim to build a simple yet …

Cited by 124 · Related articles · All 6 versions

BEVDepth: Acquisition of reliable depth for multi-view 3D object detection

Y Li, Z Ge, G Yu, J Yang, Z Wang, Y Shi… - Proceedings of the AAAI …, 2023 - ojs.aaai.org

In this research, we propose a new 3D object detector with a trustworthy depth estimation,
dubbed BEVDepth, for camera-based Bird's-Eye-View (BEV) 3D object detection. Our work …

Cited by 391 · Related articles · All 5 versions

BEVFormer: Learning bird's-eye-view representation from multi-camera images via spatiotemporal transformers

Z Li, W Wang, H Li, E Xie, C Sima, T Lu, Y Qiao… - European conference on …, 2022 - Springer

3D visual perception tasks, including 3D detection and map segmentation based on
multi-camera images, are essential for autonomous driving systems. In this work, we present …

Cited by 820 · Related articles · All 9 versions

BEVFormer v2: Adapting modern image backbones to bird's-eye-view recognition via perspective supervision

C Yang, Y Chen, H Tian, C Tao, X Zhu… - Proceedings of the …, 2023 - openaccess.thecvf.com

We present a novel bird's-eye-view (BEV) detector with perspective supervision, which
converges faster and better suits modern image backbones. Existing state-of-the-art BEV …

Cited by 151 · Related articles · All 9 versions

SurroundOcc: Multi-camera 3D occupancy prediction for autonomous driving

Y Wei, L Zhao, W Zheng, Z Zhu… - Proceedings of the …, 2023 - openaccess.thecvf.com

3D scene understanding plays a vital role in vision-based autonomous driving.
While most existing methods focus on 3D object detection, they have difficulty describing …

Cited by 112 · Related articles · All 5 versions

Unifying voxel-based representation with transformer for 3D object detection

Y Li, Y Chen, X Qi, Z Li, J Sun… - Advances in Neural …, 2022 - proceedings.neurips.cc

In this work, we present a unified framework for multi-modality 3D object detection, named
UVTR. The proposed method aims to unify multi-modality representations in the voxel space …

Cited by 187 · Related articles · All 6 versions

Cross-view transformers for real-time map-view semantic segmentation

B Zhou, P Krähenbühl - … of the IEEE/CVF conference on …, 2022 - openaccess.thecvf.com

We present cross-view transformers, an efficient attention-based model for map-view
semantic segmentation from multiple cameras. Our architecture implicitly learns a mapping …

Cited by 213 · Related articles · All 12 versions

ByteTrack: Multi-object tracking by associating every detection box

Y Zhang, P Sun, Y Jiang, D Yu, F Weng, Z Yuan… - European conference on …, 2022 - Springer

Multi-object tracking (MOT) aims at estimating bounding boxes and identities of objects in
videos. Most methods obtain identities by associating detection boxes whose scores are …

Cited by 1162 · Related articles · All 12 versions