Survey on large language model-enhanced reinforcement learning: Concept, taxonomy, and methods
With extensive pretrained knowledge and high-level general capabilities, large language
models (LLMs) emerge as a promising avenue to augment reinforcement learning (RL) in …
The rise and potential of large language model based agents: A survey
For a long time, humanity has pursued artificial intelligence (AI) equivalent to or surpassing
the human level, with AI agents considered a promising vehicle for this pursuit. AI agents are …
Foundation models in robotics: Applications, challenges, and the future
We survey applications of pretrained foundation models in robotics. Traditional deep
learning models in robotics are trained on small datasets tailored for specific tasks, which …
DriveDreamer: Towards real-world-driven world models for autonomous driving
World models, especially in autonomous driving, are trending and drawing extensive
attention due to their capacity for comprehending driving environments. The established …
ManiGaussian: Dynamic Gaussian splatting for multi-task robotic manipulation
Performing language-conditioned robotic manipulation tasks in unstructured environments
is highly demanded for general intelligent robots. Conventional robotic manipulation …
Towards efficient LLM grounding for embodied multi-agent collaboration
Grounding the reasoning ability of large language models (LLMs) for embodied tasks is
challenging due to the complexity of the physical world. Especially, LLM planning for multi …
Policy adaptation via language optimization: Decomposing tasks for few-shot imitation
Learned language-conditioned robot policies often struggle to effectively adapt to new real-
world tasks even when pre-trained across a diverse set of instructions. We propose a novel …
QUAR-VLA: Vision-language-action model for quadruped robots
The important manifestation of robot intelligence is the ability to naturally interact and
autonomously make decisions. Traditional quadruped robot learning typically handles …
WorldDreamer: Towards general world models for video generation via predicting masked tokens
World models play a crucial role in understanding and predicting the dynamics of the world,
which is essential for video generation. However, existing world models are confined to …
Sora as an AGI world model? A complete survey on text-to-video generation
Text-to-video generation marks a significant frontier in the rapidly evolving domain of
generative AI, integrating advancements in text-to-image synthesis, video captioning, and …