Active visual information gathering for vision-language navigation

J Gu, E Stefani, Q Wu, J Thomason… - arXiv preprint arXiv …, 2022 - arxiv.org

A long-term goal of AI research is to build intelligent agents that can communicate with
humans in natural language, perceive the environment, and perform real-world tasks. Vision …

Cited by 97 · Related articles · All 6 versions

Core challenges in embodied vision-language planning

J Francis, N Kitamura, F Labelle, X Lu, I Navarro… - Journal of Artificial …, 2022 - jair.org

Recent advances in the areas of multimodal machine learning and artificial intelligence (AI)
have led to the development of challenging tasks at the intersection of Computer Vision …

Cited by 32 · Related articles · All 14 versions

Scaling data generation in vision-and-language navigation

Z Wang, J Li, Y Hong, Y Wang, Q Wu… - Proceedings of the …, 2023 - openaccess.thecvf.com

Recent research in language-guided visual navigation has demonstrated a significant
demand for the diversity of traversable environments and the quantity of supervision for …

Cited by 28 · Related articles · All 6 versions

Bird's-Eye-View Scene Graph for Vision-Language Navigation

R Liu, X Wang, W Wang… - Proceedings of the IEEE …, 2023 - openaccess.thecvf.com

Vision-language navigation (VLN), which entails an agent to navigate 3D
environments following human instructions, has shown great advances. However, current …

Cited by 20 · Related articles · All 5 versions

Looking beyond single images for weakly supervised semantic segmentation learning

W Wang, G Sun, L Van Gool - IEEE Transactions on Pattern …, 2022 - ieeexplore.ieee.org

This article studies the problem of learning weakly supervised semantic segmentation
(WSSS) from image-level supervision only. Current popular solutions leverage object …

Cited by 65 · Related articles · All 9 versions

Hop: History-and-order aware pre-training for vision-and-language navigation

Y Qiao, Y Qi, Y Hong, Z Yu… - Proceedings of the …, 2022 - openaccess.thecvf.com

Pre-training has been adopted in a few of recent works for Vision-and-Language Navigation
(VLN). However, previous pre-training methods for VLN either lack the ability to predict …

Cited by 65 · Related articles · All 7 versions

Dreamwalker: Mental planning for continuous vision-language navigation

H Wang, W Liang, L Van Gool… - Proceedings of the …, 2023 - openaccess.thecvf.com

VLN-CE is a recently released embodied task, where AI agents need to navigate a freely
traversable environment to reach a distant target location, given language instructions. It …

Cited by 15 · Related articles · All 6 versions

Structured scene memory for vision-language navigation

H Wang, W Wang, W Liang… - Proceedings of the …, 2021 - openaccess.thecvf.com

Recently, numerous algorithms have been developed to tackle the problem of vision-
language navigation (VLN), i.e., entailing an agent to navigate 3D environments through …

Cited by 105 · Related articles · All 9 versions

Adaptive zone-aware hierarchical planner for vision-language navigation

C Gao, X Peng, M Yan, H Wang… - Proceedings of the …, 2023 - openaccess.thecvf.com

The task of Vision-Language Navigation (VLN) is for an embodied agent to reach
the global goal according to the instruction. Essentially, during navigation, a series of sub …

Cited by 21 · Related articles · All 6 versions

Target-driven structured transformer planner for vision-language navigation

Y Zhao, J Chen, C Gao, W Wang, L Yang… - Proceedings of the 30th …, 2022 - dl.acm.org

Vision-language navigation is the task of directing an embodied agent to navigate in 3D
scenes with natural language instructions. For the agent, inferring the long-term navigation …

Cited by 46 · Related articles · All 5 versions