Vima: General robot manipulation with multimodal prompts

Y Jiang, A Gupta, Z Zhang, G Wang… - arXiv preprint …, 2022 - authors.library.caltech.edu
Prompt-based learning has emerged as a successful paradigm in natural language
processing, where a single general-purpose language model can be instructed to perform …

Navgpt: Explicit reasoning in vision-and-language navigation with large language models

G Zhou, Y Hong, Q Wu - Proceedings of the AAAI Conference on …, 2024 - ojs.aaai.org
Trained with an unprecedented scale of data, large language models (LLMs) like ChatGPT
and GPT-4 exhibit the emergence of significant reasoning abilities from model scaling. Such …

Scaling data generation in vision-and-language navigation

Z Wang, J Li, Y Hong, Y Wang, Q Wu… - Proceedings of the …, 2023 - openaccess.thecvf.com
Recent research in language-guided visual navigation has demonstrated a significant
demand for the diversity of traversable environments and the quantity of supervision for …

Panogen: Text-conditioned panoramic environment generation for vision-and-language navigation

J Li, M Bansal - Advances in Neural Information Processing …, 2024 - proceedings.neurips.cc
Abstract Vision-and-Language Navigation requires the agent to follow language instructions
to navigate through 3D environments. One main challenge in Vision-and-Language …

Lana: A language-capable navigator for instruction following and generation

X Wang, W Wang, J Shao… - Proceedings of the IEEE …, 2023 - openaccess.thecvf.com
Recently, visual-language navigation (VLN)--requiring robot agents to follow navigation
instructions--has shown great advances. However, existing literature puts most emphasis on …

Discuss before moving: Visual language navigation via multi-expert discussions

Y Long, X Li, W Cai, H Dong - 2024 IEEE International …, 2024 - ieeexplore.ieee.org
Visual language navigation (VLN) is an embodied task demanding a wide range of skills
encompassing understanding, perception, and planning. For such a multifaceted challenge …

Bevbert: Multimodal map pre-training for language-guided navigation

D An, Y Qi, Y Li, Y Huang, L Wang… - Proceedings of the …, 2023 - openaccess.thecvf.com
Large-scale pre-training has shown promising results on the vision-and-language
navigation (VLN) task. However, most existing pre-training methods employ discrete …

Learning vision-and-language navigation from youtube videos

K Lin, P Chen, D Huang, TH Li… - Proceedings of the …, 2023 - openaccess.thecvf.com
Vision-and-language navigation (VLN) requires an embodied agent to navigate in realistic
3D environments using natural language instructions. Existing VLN methods suffer from …

Etpnav: Evolving topological planning for vision-language navigation in continuous environments

D An, H Wang, W Wang, Z Wang… - … on Pattern Analysis …, 2024 - ieeexplore.ieee.org
Vision-language navigation is a task that requires an agent to follow instructions to navigate
in environments. It becomes increasingly crucial in the field of embodied AI, with potential …

Object-goal visual navigation via effective exploration of relations among historical navigation states

H Du, L Li, Z Huang, X Yu - … of the IEEE/CVF Conference on …, 2023 - openaccess.thecvf.com
Object-goal visual navigation aims at steering an agent toward an object via a series of
moving steps. Previous works mainly focus on learning informative visual representations for …