Perceiver-actor: A multi-task transformer for robotic manipulation
Transformers have revolutionized vision and natural language processing with their ability to
scale with large datasets. But in robotic manipulation, data is both limited and expensive …
Instruction-driven history-aware policies for robotic manipulations
In human environments, robots are expected to accomplish a variety of manipulation tasks
given simple natural language instructions. Yet, robotic manipulation is extremely …
Calvin: A benchmark for language-conditioned policy learning for long-horizon robot manipulation tasks
General-purpose robots coexisting with humans in their environment must learn to relate
human language to their perceptions and actions to be useful in a range of daily tasks …
What matters in language conditioned robotic imitation learning over unstructured data
A long-standing goal in robotics is to build robots that can perform a wide range of daily
tasks from perceptions obtained with their onboard sensors and specified only via natural …
Grounding language with visual affordances over unstructured data
Recent works have shown that Large Language Models (LLMs) can be applied to ground
natural language to a wide variety of robot skills. However, in practice, learning multi-task …
Concept2robot: Learning manipulation concepts from instructions and human demonstrations
We aim to endow a robot with the ability to learn manipulation concepts that link natural
language instructions to motor skills. Our goal is to learn a single multi-task policy that takes …
Artificial intelligence foundation and pre-trained models: Fundamentals, applications, opportunities, and social impacts
A Kolides, A Nawaz, A Rathor, D Beeman… - … Modelling Practice and …, 2023 - Elsevier
With the emergence of foundation models (FMs) that are trained on large amounts of data at
scale and adaptable to a wide range of downstream applications, AI is experiencing a …
Graspgpt: Leveraging semantic knowledge from a large language model for task-oriented grasping
Task-oriented grasping (TOG) refers to the problem of predicting grasps on an object that
enable subsequent manipulation tasks. To model the complex relationships between …
Task-oriented grasp prediction with visual-language inputs
To perform household tasks, assistive robots receive commands in the form of user
language instructions for tool manipulation. The initial stage involves selecting the intended …
A joint network for grasp detection conditioned on natural language commands
We consider the task of grasping a target object based on a natural language command
query. Previous work primarily focused on localizing the object given the query, which …