OpenOccupancy: A large scale benchmark for surrounding semantic occupancy perception X Wang, Z Zhu, W Xu, Y Zhang, Y Wei, X Chi, Y Ye, D Du, J Lu, X Wang ICCV-2023, 2023 | 120 | 2023 |
MVSTER: Epipolar transformer for efficient multi-view stereo X Wang, Z Zhu, G Huang, F Qin, Y Ye, Y He, X Chi, X Wang ECCV-2022, 2022 | 99 | 2022 |
DriveDreamer: Towards real-world-driven world models for autonomous driving X Wang, Z Zhu, G Huang, X Chen, J Zhu, J Lu ECCV-2024, 2023 | 90 | 2023 |
On the Road with GPT-4V(ision): Explorations of Utilizing Visual-Language Model as Autonomous Driving Agent L Wen*, X Yang*, D Fu*, X Wang*, P Cai, X Li, MA Tao, Y Li, XU Linran, ... (Equal Contribution) ICLR 2024 Workshop on Large Language Model (LLM) Agents, 2024 | 67* | 2024 |
Is Sora a world simulator? A comprehensive survey on general world models and beyond Z Zhu*, X Wang*, W Zhao*, C Min*, N Deng*, M Dou*, Y Wang*, B Shi, ... (Equal Contribution) arXiv preprint arXiv:2405.03520, 2024 | 23 | 2024 |
DriveDreamer-2: LLM-enhanced world models for diverse driving video generation G Zhao*, X Wang*, Z Zhu*, X Chen, G Huang, X Bao, X Wang (Equal Contribution) arXiv preprint arXiv:2403.06845, 2024 | 20 | 2024 |
Bridging Stereo Geometry and BEV Representation with Reliable Mutual Interaction for Semantic Scene Completion WZ B Li, Y Sun, Z Liang, D Du, Z Zhang, X Wang, Y Wang, X Jin IJCAI-2024, 2024 | 19* | 2024 |
Crafting monocular cues and velocity guidance for self-supervised multi-frame depth learning X Wang, Z Zhu, G Huang, X Chi, Y Ye, Z Chen, X Wang AAAI-2023, 2023 | 15 | 2023 |
Are we ready for vision-centric driving streaming perception? The ASAP benchmark X Wang, Z Zhu, Y Zhang, G Huang, Y Ye, W Xu, Z Chen, X Wang CVPR-2023, 2023 | 15 | 2023 |
WorldDreamer: Towards general world models for video generation via predicting masked tokens X Wang, Z Zhu, G Huang, B Wang, X Chen, J Lu arXiv preprint arXiv:2401.09985, 2024 | 14 | 2024 |
LiftedCL: Lifting contrastive learning for human-centric perception Z Chen, Q Li, X Wang, W Yang ICLR-2023, 2023 | 7 | 2023 |
DriveDreamer4D: World models are effective data machines for 4D driving scene representation G Zhao*, C Ni*, X Wang*, Z Zhu*, G Huang, X Chen, B Wang, Y Zhang, ... (Equal Contribution) arXiv preprint arXiv:2410.13571, 2024 | 2 | 2024 |
EgoVid-5M: A Large-Scale Video-Action Dataset for Egocentric Video Generation X Wang, K Zhao, F Liu, J Wang, G Zhao, X Bao, Z Zhu, Y Zhang, X Wang arXiv preprint arXiv:2411.08380, 2024 | | 2024 |
A Multimodal Neural Network for Contact State Recognition During Probe Implantation into Skull Holes Y Song, X Wang, D Zhang 2023 IEEE 19th International Conference on Automation Science and …, 2023 | | 2023 |
Supplementary Material for DriveDreamer: Towards Real-world-driven World Models for Autonomous Driving X Wang, Z Zhu, G Huang, B Wang, X Chen, J Lu | | |