ShapeLLM: Universal 3D object understanding for embodied interaction
This paper presents ShapeLLM, the first 3D Multimodal Large Language Model (LLM)
designed for embodied interaction, exploring a universal 3D object understanding with 3D …
ManipLLM: Embodied multimodal large language model for object-centric robotic manipulation
Robot manipulation relies on accurately predicting contact points and end-effector directions
to ensure successful operation. However, learning-based robot manipulation trained on a …
ChainedDiffuser: Unifying trajectory diffusion and keypose prediction for robotic manipulation
We present ChainedDiffuser, a policy architecture that unifies action keypose prediction and
trajectory diffusion generation for learning robot manipulation from demonstrations. Our …
NAP: Neural 3D articulated object prior
We propose Neural 3D Articulated object Prior (NAP), the first 3D deep generative
model to synthesize 3D articulated object models. Despite the extensive research on …
Robo-ABC: Affordance generalization beyond categories via semantic correspondence for robot manipulation
Enabling robotic manipulation that generalizes to out-of-distribution scenes is a crucial step
toward open-world embodied intelligence. For human beings, this ability is rooted in the …
ARNOLD: A benchmark for language-grounded task learning with continuous states in realistic 3D scenes
Understanding the continuous states of objects is essential for task learning and planning in
the real world. However, most existing task learning benchmarks assume discrete (e.g., …
RAM: Retrieval-based affordance transfer for generalizable zero-shot robotic manipulation
This work proposes a retrieve-and-transfer framework for zero-shot robotic manipulation,
dubbed RAM, featuring generalizability across various objects, environments, and …
UniDoorManip: Learning universal door manipulation policy over large-scale and diverse door manipulation environments
Learning a universal manipulation policy encompassing doors with diverse categories,
geometries and mechanisms, is crucial for future embodied agents to effectively work in …
Open-vocabulary affordance detection using knowledge distillation and text-point correlation
Affordance detection presents intricate challenges and has a wide range of robotic
applications. Previous works have faced limitations such as the complexities of 3D object …
UniGarmentManip: A Unified Framework for Category-Level Garment Manipulation via Dense Visual Correspondence
Garment manipulation (e.g., unfolding, folding, and hanging clothes) is essential for future
robots to accomplish home-assistant tasks while highly challenging due to the diversity of …