DPFT: Dual perspective fusion transformer for camera-radar-based object detection

F Fent, A Palffy, H Caesar - arXiv preprint arXiv:2404.03015, 2024 - arxiv.org
The perception of autonomous vehicles has to be efficient, robust, and cost-effective.
However, cameras are not robust against severe weather conditions, lidar sensors are …

SparseFusion: Efficient Sparse Multi-Modal Fusion Framework for Long-Range 3D Perception

Y Li, H Li, Z Huang, H Chang, N Wang - arXiv preprint arXiv:2403.10036, 2024 - arxiv.org
Multi-modal 3D object detection has exhibited significant progress in recent years. However,
most existing methods can hardly scale to long-range scenarios due to their reliance on …

Camera-based Online Vectorized HD Map Construction with Incomplete Observation

H Liu, F Chang, C Liu, Y Lu… - IEEE Robotics and …, 2024 - ieeexplore.ieee.org
Camera-based online map construction focuses on learning map elements from
surround-view images. Distinguished with previous methods that rely on complete observations, we …

Cross-Modal Transformers for Robust Multi-Modal BEV Detection

C Kang, X Zhou, C Ying, W Shang, X Wei, Y Dong - robodrive-24.github.io
In this paper, we elaborate on the practical application and demonstration of Cross-Modal
Transformer (CMT) for Track 5–Robust Multi-Modal BEV Detection, in the 2024 RoboDrive …