From language to goals: Inverse reinforcement learning for vision-based instruction following

J Ji, T Qiu, B Chen, B Zhang, H Lou, K Wang… - arXiv preprint arXiv …, 2023 - arxiv.org

AI alignment aims to make AI systems behave in line with human intentions and values. As
AI systems grow more capable, the potential large-scale risks associated with misaligned AI …

Cited by 128 · Related articles · All 3 versions

ChatGPT for robotics: Design principles and model abilities

SH Vemprala, R Bonatti, A Bucker, A Kapoor - IEEE Access, 2024 - ieeexplore.ieee.org

This paper presents an experimental study regarding the use of OpenAI's ChatGPT for
robotics applications. We outline a strategy that combines design principles for prompt …

Cited by 359 · Related articles · All 7 versions

Open problems and fundamental limitations of reinforcement learning from human feedback

S Casper, X Davies, C Shi, TK Gilbert… - arXiv preprint arXiv …, 2023 - arxiv.org

Reinforcement learning from human feedback (RLHF) is a technique for training AI systems
to align with human goals. RLHF has emerged as the central method used to finetune state …

Cited by 280 · Related articles · All 6 versions

Language to rewards for robotic skill synthesis

W Yu, N Gileadi, C Fu, S Kirmani, KH Lee… - arXiv preprint arXiv …, 2023 - arxiv.org

Large language models (LLMs) have demonstrated exciting progress in acquiring diverse
new capabilities through in-context learning, ranging from logical reasoning to code-writing …

Cited by 176 · Related articles · All 4 versions

Language models as zero-shot planners: Extracting actionable knowledge for embodied agents

W Huang, P Abbeel, D Pathak… - … conference on machine …, 2022 - proceedings.mlr.press

Can world knowledge learned by large language models (LLMs) be used to act in
interactive environments? In this paper, we investigate the possibility of grounding high-level …

Cited by 813 · Related articles · All 5 versions

On the opportunities and risks of foundation models

R Bommasani, DA Hudson, E Adeli, R Altman… - arXiv preprint arXiv …, 2021 - arxiv.org

AI is undergoing a paradigm shift with the rise of models (e.g., BERT, DALL-E, GPT-3) that are
trained on broad data at scale and are adaptable to a wide range of downstream tasks. We …

Cited by 3618 · Related articles · All 2 versions

Vision-and-language navigation: A survey of tasks, methods, and future directions

J Gu, E Stefani, Q Wu, J Thomason… - arXiv preprint arXiv …, 2022 - arxiv.org

A long-term goal of AI research is to build intelligent agents that can communicate with
humans in natural language, perceive the environment, and perform real-world tasks. Vision …

Cited by 104 · Related articles · All 6 versions

Learning language-conditioned robot behavior from offline data and crowd-sourced annotation

S Nair, E Mitchell, K Chen… - Conference on Robot …, 2022 - proceedings.mlr.press

We study the problem of learning a range of vision-based manipulation tasks from a large
offline dataset of robot interaction. In order to accomplish this, humans need easy and …

Cited by 129 · Related articles · All 5 versions

BEHAVIOR: Benchmark for everyday household activities in virtual, interactive, and ecological environments

S Srivastava, C Li, M Lingelbach… - … on robot learning, 2022 - proceedings.mlr.press

We introduce BEHAVIOR, a benchmark for embodied AI with 100 activities in simulation,
spanning a range of everyday household chores such as cleaning, maintenance, and food …

Cited by 131 · Related articles · All 4 versions

Search on the replay buffer: Bridging planning and reinforcement learning

B Eysenbach, RR Salakhutdinov… - Advances in neural …, 2019 - proceedings.neurips.cc

The history of learning for control has been an exciting back and forth between two broad
classes of algorithms: planning and reinforcement learning. Planning algorithms effectively …

Cited by 322 · Related articles · All 10 versions