A survey of embodied AI: From simulators to research tasks
There has been an emerging paradigm shift from the era of “internet AI” to “embodied AI,”
where AI algorithms and agents no longer learn from datasets of images, videos or text …
Cross-modal map learning for vision and language navigation
G Georgakis, K Schmeckpeper… - Proceedings of the …, 2022 - openaccess.thecvf.com
We consider the problem of Vision-and-Language Navigation (VLN). The majority of current
methods for VLN are trained end-to-end using either unstructured memory such as LSTM, or …
Visual navigation with spatial attention
This work focuses on object goal visual navigation, aiming at finding the location of an object
from a given class, where in each step the agent is provided with an egocentric RGB image …
3D-aware object goal navigation via simultaneous exploration and identification
Object goal navigation (ObjectNav) in unseen environments is a fundamental task for
Embodied AI. Agents in existing works learn ObjectNav policies based on 2D maps, scene …
Semantic MapNet: Building allocentric semantic maps and representations from egocentric views
We study the task of semantic mapping–specifically, an embodied agent (a robot or an
egocentric AI assistant) is given a tour of a new environment and asked to build an …
Uncertainty-driven planner for exploration and navigation
G Georgakis, B Bucher, A Arapin… - … on Robotics and …, 2022 - ieeexplore.ieee.org
We consider the problems of exploration and pointgoal navigation in previously unseen
environments, where the spatial complexity of indoor scenes and partial observability …
Robust-EQA: robust learning for embodied question answering with noisy labels
Embodied question answering (EQA) is a recently emerged research field in which an agent
is asked to answer the user's questions by exploring the environment and collecting visual …
A survey of visual navigation: From geometry to embodied AI
T Zhang, X Hu, J Xiao, G Zhang - Engineering Applications of Artificial …, 2022 - Elsevier
The capacity to extract information and comprehend an unseen environment is critical for
mobile robots to navigate. Few surveys have mentioned the combinatorial-non-optimality …
Transformer-based vision-language alignment for robot navigation and question answering
The task of robot navigation and question answering, which is also known as Embodied
Question Answering (EQA), places its emphasis on empowering agents to actively explore …
Depth and video segmentation based visual attention for embodied question answering
Embodied Question Answering (EQA) is a newly defined research area where an agent is
required to answer the user's questions by exploring the real-world environment. It has …