ReplanVLM: Replanning robotic tasks with visual language models

A Mei, GN Zhu, H Zhang, Z Gan - IEEE Robotics and …, 2024 - ieeexplore.ieee.org
Large language models (LLMs) have gained increasing popularity in robotic task planning
due to their exceptional abilities in text analytics and generation, as well as their broad …

Closed-loop open-vocabulary mobile manipulation with gpt-4v

P Zhi, Z Zhang, M Han, Z Zhang, Z Li, Z Jiao… - arXiv preprint arXiv …, 2024 - arxiv.org
Autonomous robot navigation and manipulation in open environments require reasoning
and replanning with closed-loop feedback. We present COME-robot, the first closed-loop …

ORGANA: A Robotic Assistant for Automated Chemistry Experimentation and Characterization

K Darvish, M Skreta, Y Zhao, N Yoshikawa… - arXiv preprint arXiv …, 2024 - arxiv.org
Chemistry experimentation is often resource- and labor-intensive. Despite the many benefits
incurred by the integration of advanced and special-purpose lab equipment, many aspects …

AHA: A Vision-Language-Model for Detecting and Reasoning Over Failures in Robotic Manipulation

J Duan, W Pumacay, N Kumar, YR Wang… - arXiv preprint arXiv …, 2024 - arxiv.org
Robotic manipulation in open-world settings requires not only task execution but also the
ability to detect and learn from failures. While recent advances in vision-language models …

VLMimic: Vision Language Models are Visual Imitation Learner for Fine-grained Actions

G Chen, M Wang, T Cui, Y Mu, H Lu, T Zhou… - arXiv preprint arXiv …, 2024 - arxiv.org
Visual imitation learning (VIL) provides an efficient and intuitive strategy for robotic systems
to acquire novel skills. Recent advancements in Vision Language Models (VLMs) have …

Guiding Long-Horizon Task and Motion Planning with Vision Language Models

Z Yang, C Garrett, D Fox, T Lozano-Pérez… - arXiv preprint arXiv …, 2024 - arxiv.org
Vision-Language Models (VLM) can generate plausible high-level plans when prompted
with a goal, the context, an image of the scene, and any planning constraints. However …

Sensorimotor Attention and Language-based Regressions in Shared Latent Variables for Integrating Robot Motion Learning and LLM

K Suzuki, T Ogata - … on Intelligent Robots and Systems (IROS), 2024 - ieeexplore.ieee.org
In recent years, studies have been actively conducted on combining large language models
(LLM) and robotics; however, most have not considered end-to-end feedback in the robot …

GameVLM: A Decision-making Framework for Robotic Task Planning Based on Visual Language Models and Zero-sum Games

A Mei, J Wang, GN Zhu, Z Gan - arXiv preprint arXiv:2405.13751, 2024 - arxiv.org
With their prominent scene understanding and reasoning capabilities, pre-trained
visual-language models (VLMs) such as GPT-4V have attracted increasing attention in robotic task …

SuFIA: Language-Guided Augmented Dexterity for Robotic Surgical Assistants

M Moghani, L Doorenbos, WCH Panitch… - arXiv preprint arXiv …, 2024 - arxiv.org
In this work, we present SuFIA, the first framework for natural language-guided augmented
dexterity for robotic surgical assistants. SuFIA incorporates the strong reasoning capabilities …

Creative Problem Solving in Large Language and Vision Models--What Would it Take?

L Nair, E Gizzi, J Sinapov - arXiv preprint arXiv:2405.01453, 2024 - arxiv.org
In this paper, we discuss approaches for integrating Computational Creativity (CC) with
research in large language and vision models (LLVMs) to address a key limitation of these …