Core challenges in embodied vision-language planning
Recent advances in the areas of multimodal machine learning and artificial intelligence (AI)
have led to the development of challenging tasks at the intersection of Computer Vision …
Habitat-web: Learning embodied object-search strategies from human demonstrations at scale
R Ramrakhya, E Undersander… - Proceedings of the …, 2022 - openaccess.thecvf.com
We present a large-scale study of imitating human demonstrations on tasks that require a
virtual robot to search for objects in new environments: (1) ObjectGoal Navigation (e.g. 'find & …
Soundspaces: Audio-visual navigation in 3d environments
Moving around in the world is naturally a multisensory experience, but today's embodied
agents are deaf—restricted to solely their visual perception of the environment. We introduce …
Auxiliary tasks and exploration enable objectgoal navigation
ObjectGoal Navigation (ObjectNav) is an embodied task wherein agents are to
navigate to an object instance in an unseen environment. Prior works have shown that end …
Excalibur: Encouraging and evaluating embodied exploration
Experience precedes understanding. Humans constantly explore and learn about their
environment out of curiosity, gather information, and update their models of the world. On the …
A cordial sync: Going beyond marginal policies for multi-agent embodied tasks
Autonomous agents must learn to collaborate. It is not scalable to develop a new centralized
agent every time a task's difficulty outpaces a single agent's abilities. While multi-agent …
Visual language navigation: A survey and open challenges
SM Park, YG Kim - Artificial Intelligence Review, 2023 - Springer
With the recent development of deep learning, AI models are widely used in various
domains. AI models show good performance for definite tasks such as image classification …
Perceiving the world: Question-guided reinforcement learning for text-based games
Text-based games provide an interactive way to study natural language processing. While
deep reinforcement learning has shown effectiveness in developing the game playing …
Light-weight probing of unsupervised representations for reinforcement learning
Unsupervised visual representation learning offers the opportunity to leverage large corpora
of unlabeled trajectories to form useful visual representations, which can benefit the training …
Ical: Continual learning of multimodal agents by transforming trajectories into actionable insights
Large-scale generative language and vision-language models (LLMs and VLMs) excel in
few-shot in-context learning for decision making and instruction following. However, they …