Prospective role of foundation models in advancing autonomous vehicles

J Wu, B Gao, J Gao, J Yu, H Chu, Q Yu, X Gong… - Research, 2024 - spj.science.org
With the development of artificial intelligence and breakthroughs in deep learning,
large-scale foundation models (FMs), such as generative pre-trained transformer (GPT), Sora, etc …

DriveDreamer4D: World models are effective data machines for 4D driving scene representation

G Zhao, C Ni, X Wang, Z Zhu, X Zhang, Y Wang… - arXiv preprint arXiv …, 2024 - arxiv.org
Closed-loop simulation is essential for advancing end-to-end autonomous driving systems.
Contemporary sensor simulation methods, such as NeRF and 3DGS, rely predominantly on …

DriveDreamer-2: LLM-enhanced world models for diverse driving video generation

G Zhao, X Wang, Z Zhu, X Chen, G Huang… - arXiv preprint arXiv …, 2024 - arxiv.org
World models have demonstrated superiority in autonomous driving, particularly in the
generation of multi-view driving videos. However, significant challenges still exist in …

World models for autonomous driving: An initial survey

Y Guan, H Liao, Z Li, J Hu, R Yuan, Y Li… - IEEE Transactions …, 2024 - ieeexplore.ieee.org
In the rapidly evolving landscape of autonomous driving, the capability to accurately predict
future events and assess their implications is paramount for both safety and efficiency …

VBench++: Comprehensive and versatile benchmark suite for video generative models

Z Huang, F Zhang, X Xu, Y He, J Yu, Z Dong… - arXiv preprint arXiv …, 2024 - arxiv.org
Video generation has witnessed significant advancements, yet evaluating these models
remains a challenge. A comprehensive evaluation benchmark for video generation is …

Scenario-based Accelerated Testing for SOTIF in Autonomous Driving: A Review

L Tang, R Wang, Z Liu, Y Liang, Y Niu… - IEEE Internet of …, 2024 - ieeexplore.ieee.org
The development of intelligent driving systems has drawn significant attention to enhancing
the safety of autonomous vehicles and their intended functionality. Despite this, current …

PredToken: Predicting Unknown Tokens and Beyond with Coarse-to-Fine Iterative Decoding

X Nie, H Jin, Y Yan, X Chen… - Proceedings of the …, 2024 - openaccess.thecvf.com
Predictive learning models which aim to predict future frames based on past observations
are crucial to constructing world models. These models need to maintain low-level …

Pre-trained Visual Dynamics Representations for Efficient Policy Learning

H Luo, B Zhou, Z Lu - European Conference on Computer Vision, 2025 - Springer
Pre-training for Reinforcement Learning (RL) with purely video data is a valuable
yet challenging problem. Although in-the-wild videos are readily available and inhere a vast …

EgoVid-5M: A Large-Scale Video-Action Dataset for Egocentric Video Generation

X Wang, K Zhao, F Liu, J Wang, G Zhao, X Bao… - arXiv preprint arXiv …, 2024 - arxiv.org
Video generation has emerged as a promising tool for world simulation, leveraging visual
data to replicate real-world environments. Within this context, egocentric video generation …

WorldGPT: Empowering LLM as multimodal world model

Z Ge, H Huang, M Zhou, J Li, G Wang, S Tang… - arXiv preprint arXiv …, 2024 - arxiv.org
World models are progressively being employed across diverse fields, extending from basic
environment simulation to complex scenario construction. However, existing models are …