End-to-end autonomous driving: Challenges and frontiers

L Chen, P Wu, K Chitta, B Jaeger… - IEEE Transactions on …, 2024 - ieeexplore.ieee.org
The autonomous driving community has witnessed a rapid growth in approaches that
embrace an end-to-end algorithm framework, utilizing raw sensor input to generate vehicle …

Grid-centric traffic scenario perception for autonomous driving: A comprehensive review

Y Shi, K Jiang, J Li, Z Qian, J Wen, M Yang… - arXiv preprint arXiv …, 2023 - arxiv.org
Grid-centric perception is a crucial field for mobile robot perception and navigation.
Nonetheless, grid-centric perception is less prevalent than object-centric perception as …

Is Sora a world simulator? A comprehensive survey on general world models and beyond

Z Zhu, X Wang, W Zhao, C Min, N Deng, M Dou… - arXiv preprint arXiv …, 2024 - arxiv.org
General world models represent a crucial pathway toward achieving Artificial General
Intelligence (AGI), serving as the cornerstone for various applications ranging from virtual …

LLM4Drive: A survey of large language models for autonomous driving

Z Yang, X Jia, H Li, J Yan - … 2024 Workshop on Open-World Agents, 2023 - openreview.net
Autonomous driving technology, a catalyst for revolutionizing transportation and urban
mobility, has tended to transition from rule-based systems to data-driven strategies …

SLEDGE: Synthesizing driving environments with generative models and rule-based traffic

K Chitta, D Dauner, A Geiger - European Conference on Computer Vision, 2025 - Springer
SLEDGE is the first generative simulator for vehicle motion planning trained on real-world
driving logs. Its core component is a learned model that is able to generate agent bounding …

BEVWorld: A multimodal world model for autonomous driving via unified BEV latent space

Y Zhang, S Gong, K Xiong, X Ye, X Tan, F Wang… - arXiv preprint arXiv …, 2024 - arxiv.org
World models are receiving increasing attention in autonomous driving for their ability to
predict potential future scenarios. In this paper, we present BEVWorld, a novel approach that …

Closed-loop visuomotor control with generative expectation for robotic manipulation

Q Bu, J Zeng, L Chen, Y Yang, G Zhou, J Yan… - arXiv preprint arXiv …, 2024 - arxiv.org
Despite significant progress in robotics and embodied AI in recent years, deploying robots
for long-horizon tasks remains a great challenge. The majority of prior work adheres to an open …

CoVLA: Comprehensive vision-language-action dataset for autonomous driving

H Arai, K Miwa, K Sasaki, Y Yamaguchi… - arXiv preprint arXiv …, 2024 - arxiv.org
Autonomous driving, particularly navigating complex and unanticipated scenarios, demands
sophisticated reasoning and planning capabilities. While Multi-modal Large Language …

CRASH: Crash Recognition and Anticipation System Harnessing with Context-Aware and Temporal Focus Attentions

H Liao, H Sun, H Shen, C Wang, C Tian… - Proceedings of the …, 2024 - dl.acm.org
Accurately and promptly predicting accidents among surrounding traffic agents from camera
footage is crucial for the safety of autonomous vehicles (AVs). This task presents substantial …

Vista: A Generalizable Driving World Model with High Fidelity and Versatile Controllability

S Gao, J Yang, L Chen, K Chitta, Y Qiu… - arXiv preprint arXiv …, 2024 - arxiv.org
World models can foresee the outcomes of different actions, which is of paramount
importance for autonomous driving. Nevertheless, existing driving world models still have …