A survey of embodied AI: From simulators to research tasks
There has been an emerging paradigm shift from the era of “internet AI” to “embodied AI,”
where AI algorithms and agents no longer learn from datasets of images, videos or text …
Evaluation of socially-aware robot navigation
As mobile robots are increasingly introduced into our daily lives, it grows ever more
imperative that these robots navigate with and among people in a safe and socially …
Visual language maps for robot navigation
Grounding language to the visual observations of a navigating agent can be performed
using off-the-shelf visual-language models pretrained on Internet-scale data (e.g., image …
Q-Transformer: Scalable offline reinforcement learning via autoregressive Q-functions
In this work, we present a scalable reinforcement learning method for training multi-task
policies from large offline datasets that can leverage both human demonstrations and …
Interactive language: Talking to robots in real time
We present a framework for building interactive, real-time, natural language-instructable
robots in the real world, and we open source related assets (dataset, environment …
Habitat 2.0: Training home assistants to rearrange their habitat
We introduce Habitat 2.0 (H2.0), a simulation platform for training virtual robots in
interactive 3D environments and complex physics-enabled scenarios. We make …
RVT: Robotic view transformer for 3D object manipulation
For 3D object manipulation, methods that build an explicit 3D representation perform better
than those relying only on camera images. But using explicit 3D representations like voxels …
Navigating to objects in the real world
Semantic navigation is necessary to deploy mobile robots in uncontrolled environments
such as homes or hospitals. Many learning-based approaches have been proposed in …
Habitat-Matterport 3D dataset (HM3D): 1000 large-scale 3D environments for embodied AI
We present the Habitat-Matterport 3D (HM3D) dataset. HM3D is a large-scale dataset of
1,000 building-scale 3D reconstructions from a diverse set of real-world locations. Each …
History aware multimodal transformer for vision-and-language navigation
Vision-and-language navigation (VLN) aims to build autonomous visual agents that follow
instructions and navigate in real scenes. To remember previously visited locations and …