EgoExoLearn: A Dataset for Bridging Asynchronous Ego- and Exo-centric View of Procedural Activities in Real World

Y Huang, G Chen, J Xu, M Zhang… - Proceedings of the …, 2024 - openaccess.thecvf.com
Being able to map the activities of others into one's own point of view is one fundamental
human skill even from a very early age. Taking a step toward understanding this human …

HELPER-X: A Unified Instructable Embodied Agent to Tackle Four Interactive Vision-Language Domains with Memory-Augmented Language Models

G Sarch, S Somani, R Kapoor, MJ Tarr… - arXiv preprint arXiv …, 2024 - arxiv.org
Recent research on instructable agents has used memory-augmented Large Language
Models (LLMs) as task planners, a technique that retrieves language-program examples …