Effective and general evaluation for instruction conditioned navigation using dynamic time warping

J Krantz, E Wijmans, A Majumdar, D Batra… - Computer Vision–ECCV …, 2020 - Springer

We develop a language-guided navigation task set in a continuous 3D environment where
agents must execute low-level actions to follow natural language navigation directions. By …

Cited by 230 · Related articles · All 6 versions

[PDF] mlr.press

Sim-to-real transfer for vision-and-language navigation

P Anderson, A Shrivastava, J Truong… - … on Robot Learning, 2021 - proceedings.mlr.press

We study the challenging problem of releasing a robot in a previously unseen environment,
and having it follow unconstrained natural language navigation instructions. Recent work on …

Cited by 104 · Related articles · All 6 versions

[PDF] neurips.cc

SOAT: A scene- and object-aware transformer for vision-and-language navigation

A Moudgil, A Majumdar, H Agrawal… - Advances in Neural …, 2021 - proceedings.neurips.cc

Natural language instructions for visual navigation often use scene descriptions (e.g.,
bedroom) and object references (e.g., green chairs) to provide a breadcrumb trail to a goal …

Cited by 50 · Related articles · All 6 versions

[PDF] arxiv.org

Babywalk: Going farther in vision-and-language navigation by taking baby steps

W Zhu, H Hu, J Chen, Z Deng, V Jain, E Ie… - arXiv preprint arXiv …, 2020 - arxiv.org

Learning to follow instructions is of fundamental importance to autonomous agents for
vision-and-language navigation (VLN). In this paper, we study how an agent can navigate long …

Cited by 75 · Related articles · All 7 versions

[PDF] arxiv.org

Multimodal attention networks for low-level vision-and-language navigation

F Landi, L Baraldi, M Cornia, M Corsini… - Computer vision and …, 2021 - Elsevier

Vision-and-Language Navigation (VLN) is a challenging task in which an agent
needs to follow a language-specified path to reach a target destination. The goal gets even …

Cited by 65 · Related articles · All 9 versions

[PDF] arxiv.org

Sub-instruction aware vision-and-language navigation

Y Hong, C Rodriguez-Opazo, Q Wu, S Gould - arXiv preprint arXiv …, 2020 - arxiv.org

Vision-and-language navigation requires an agent to navigate through a real 3D
environment following natural language instructions. Despite significant advances, few …

Cited by 60 · Related articles · All 7 versions

[PDF] arxiv.org

Improving cross-modal alignment in vision language navigation via syntactic information

J Li, H Tan, M Bansal - arXiv preprint arXiv:2104.09580, 2021 - arxiv.org

Vision language navigation is the task that requires an agent to navigate through a 3D
environment based on natural language instructions. One key challenge in this task is to …

Cited by 35 · Related articles · All 5 versions

[PDF] neurips.cc

Curriculum learning for vision-and-language navigation

J Zhang, J Fan, J Peng - Advances in Neural Information …, 2021 - proceedings.neurips.cc

Vision-and-Language Navigation (VLN) is a task where an agent navigates in an
embodied indoor environment under human instructions. Previous works ignore the …

Cited by 19 · Related articles · All 7 versions

[PDF] arxiv.org

Learning to stop: A simple yet effective approach to urban vision-language navigation

J Xiang, XE Wang, WY Wang - arXiv preprint arXiv:2009.13112, 2020 - arxiv.org

Vision-and-Language Navigation (VLN) is a natural language grounding task where an
agent learns to follow language instructions and navigate to specified destinations in real …

Cited by 27 · Related articles · All 3 versions

[PDF] arxiv.org

Find a way forward: a language-guided semantic map navigator

Z Wang, M Li, M Wu, MF Moens… - arXiv preprint arXiv …, 2022 - arxiv.org

In this paper, we introduce the map-language navigation task where an agent executes
natural language instructions and moves to the target position based only on a given 3D …

Cited by 6 · Related articles · All 3 versions