3D object detection for autonomous driving: A comprehensive survey
Autonomous driving, in recent years, has been receiving increasing attention for its potential
to relieve drivers' burdens and improve the safety of driving. In modern autonomous driving …
Delving into the devils of bird's-eye-view perception: A review, evaluation and recipe
Learning powerful representations in bird's-eye-view (BEV) for perception tasks is trending
and drawing extensive attention both from industry and academia. Conventional …
Depth anything: Unleashing the power of large-scale unlabeled data
This work presents Depth Anything, a highly practical solution for robust monocular
depth estimation. Without pursuing novel technical modules, we aim to build a simple yet …
BEVDepth: Acquisition of reliable depth for multi-view 3D object detection
In this research, we propose a new 3D object detector with a trustworthy depth estimation,
dubbed BEVDepth, for camera-based Bird's-Eye-View (BEV) 3D object detection. Our work …
BEVFormer: Learning bird's-eye-view representation from multi-camera images via spatiotemporal transformers
3D visual perception tasks, including 3D detection and map segmentation based on
multi-camera images, are essential for autonomous driving systems. In this work, we present …
BEVFormer v2: Adapting modern image backbones to bird's-eye-view recognition via perspective supervision
We present a novel bird's-eye-view (BEV) detector with perspective supervision, which
converges faster and better suits modern image backbones. Existing state-of-the-art BEV …
SurroundOcc: Multi-camera 3D occupancy prediction for autonomous driving
3D scene understanding plays a vital role in vision-based autonomous driving.
While most existing methods focus on 3D object detection, they have difficulty describing …
Unifying voxel-based representation with transformer for 3D object detection
In this work, we present a unified framework for multi-modality 3D object detection, named
UVTR. The proposed method aims to unify multi-modality representations in the voxel space …
Cross-view transformers for real-time map-view semantic segmentation
B Zhou, P Krähenbühl - … of the IEEE/CVF conference on …, 2022 - openaccess.thecvf.com
We present cross-view transformers, an efficient attention-based model for map-view
semantic segmentation from multiple cameras. Our architecture implicitly learns a mapping …
ByteTrack: Multi-object tracking by associating every detection box
Multi-object tracking (MOT) aims at estimating bounding boxes and identities of objects in
videos. Most methods obtain identities by associating detection boxes whose scores are …