Interactive natural language processing

Z Wang, G Zhang, K Yang, N Shi, W Zhou… - arXiv preprint arXiv …, 2023 - arxiv.org
Interactive Natural Language Processing (iNLP) has emerged as a novel paradigm within
the field of NLP, aimed at addressing limitations in existing frameworks while aligning with …

Open-ended instructable embodied agents with memory-augmented large language models

G Sarch, Y Wu, MJ Tarr, K Fragkiadaki - arXiv preprint arXiv:2310.15127, 2023 - arxiv.org
Pre-trained and frozen LLMs can effectively map simple scene re-arrangement instructions
to programs over a robot's visuomotor functions through appropriate few-shot example …

Egocentric planning for scalable embodied task achievement

X Liu, H Palacios, C Muise - Advances in Neural …, 2023 - proceedings.neurips.cc
Embodied agents face significant challenges when tasked with performing actions in diverse
environments, particularly in generalizing across object types and executing suitable actions …

ICAL: Continual learning of multimodal agents by transforming trajectories into actionable insights

G Sarch, L Jang, MJ Tarr, WW Cohen, K Marino… - arXiv preprint arXiv …, 2024 - arxiv.org
Large-scale generative language and vision-language models (LLMs and VLMs) excel in
few-shot in-context learning for decision making and instruction following. However, they …

Vision-and-Language Navigation Today and Tomorrow: A Survey in the Era of Foundation Models

Y Zhang, Z Ma, J Li, Y Qiao, Z Wang, J Chai… - arXiv preprint arXiv …, 2024 - arxiv.org
Vision-and-Language Navigation (VLN) has gained increasing attention over recent years
and many approaches have emerged to advance their development. The remarkable …

HELPER-X: A Unified Instructable Embodied Agent to Tackle Four Interactive Vision-Language Domains with Memory-Augmented Language Models

G Sarch, S Somani, R Kapoor, MJ Tarr… - arXiv preprint arXiv …, 2024 - arxiv.org
Recent research on instructable agents has used memory-augmented Large Language
Models (LLMs) as task planners, a technique that retrieves language-program examples …

CLIPGraphs: Multimodal Graph Networks to Infer Object-Room Affinities

A Agrawal, R Arora, A Datta, S Banerjee… - 2023 32nd IEEE …, 2023 - ieeexplore.ieee.org
This paper introduces a novel method for determining the best room to place an object in, for
embodied scene rearrangement. While state-of-the-art approaches rely on large language …

MSI-Agent: Incorporating Multi-Scale Insight into Embodied Agents for Superior Planning and Decision-Making

D Fu, B Qi, Y Gao, C Jiang, G Dong, B Zhou - arXiv preprint arXiv …, 2024 - arxiv.org
Long-term memory is significant for agents, in which insights play a crucial role. However,
the emergence of irrelevant insight and the lack of general insight can greatly undermine the …

Abstract meaning representation for grounded human-robot communication

C Bonial, J Foresta, NC Fung, C Hayes… - Proceedings of the …, 2023 - aclanthology.org
To collaborate effectively in physically situated tasks, robots must be able to ground
concepts in natural language to the physical objects in the environment as well as their own …

Seagull: An embodied agent for instruction following through situated dialog

Y Zhang, J Yang, K Yu, Y Dai, S Storks, Y Bao, J Pan… - 2023 - assets.amazon.science
The growing demand for advanced AI necessitates the development of an intelligent agent
capable of perceiving, reasoning, acting, and communicating within an embodied …