A survey on integration of large language models with intelligent robots
In recent years, the integration of large language models (LLMs) has revolutionized the field
of robotics, enabling robots to communicate, understand, and reason with human-like …
Open X-Embodiment: Robotic learning datasets and RT-X models
Large, high-capacity models trained on diverse datasets have shown remarkable successes
on efficiently tackling downstream applications. In domains from NLP to Computer Vision …
VoxPoser: Composable 3D value maps for robotic manipulation with language models
Large language models (LLMs) are shown to possess a wealth of actionable knowledge that
can be extracted for robot manipulation in the form of reasoning and planning. Despite the …
RT-2: Vision-language-action models transfer web knowledge to robotic control
We study how vision-language models trained on Internet-scale data can be incorporated
directly into end-to-end robotic control to boost generalization and enable emergent …
Foundation models in robotics: Applications, challenges, and the future
We survey applications of pretrained foundation models in robotics. Traditional deep
learning models in robotics are trained on small datasets tailored for specific tasks, which …
Open X-Embodiment: Robotic Learning Datasets and RT-X Models (Open X-Embodiment Collaboration)
Large, high-capacity models trained on diverse datasets have shown remarkable successes
on efficiently tackling downstream applications. In domains from NLP to Computer Vision …
Octo: An open-source generalist robot policy
Large policies pretrained on diverse robot datasets have the potential to transform robotic
learning: instead of training new policies from scratch, such generalist robot policies may be …
Where are we in the search for an artificial visual cortex for embodied intelligence?
We present the largest and most comprehensive empirical study of pre-trained visual
representations (PVRs) or visual 'foundation models' for Embodied AI. First, we curate …
TPTU: Task planning and tool usage of large language model-based AI agents
With recent advancements in natural language processing, Large Language Models (LLMs)
have emerged as powerful tools for various real-world applications. Despite their prowess …
MOKA: Open-vocabulary robotic manipulation through mark-based visual prompting
Open-vocabulary generalization requires robotic systems to perform tasks involving complex
and diverse environments and task goals. While the recent advances in vision language …