DriveDreamer-2: LLM-enhanced world models for diverse driving video generation

G Zhao, X Wang, Z Zhu, X Chen, G Huang… - arXiv preprint arXiv …, 2024 - arxiv.org
World models have demonstrated superiority in autonomous driving, particularly in the
generation of multi-view driving videos. However, significant challenges still exist in
generating customized driving videos. In this paper, we propose DriveDreamer-2, which
builds upon the framework of DriveDreamer and incorporates a Large Language Model
(LLM) to generate user-defined driving videos. Specifically, an LLM interface is initially
incorporated to convert a user's query into agent trajectories. Subsequently, a HDMap …

[CITATION][C] DriveDreamer-2: LLM-enhanced world models for diverse driving video generation. arXiv. 2024

G Zhao, X Wang, Z Zhu, X Chen, G Huang, X Bao…