AI alignment: A comprehensive survey

J Ji, T Qiu, B Chen, B Zhang, H Lou, K Wang… - arXiv preprint arXiv …, 2023 - arxiv.org
AI alignment aims to make AI systems behave in line with human intentions and values. As
AI systems grow more capable, the potential large-scale risks associated with misaligned AI …

ChatGPT for robotics: Design principles and model abilities

SH Vemprala, R Bonatti, A Bucker, A Kapoor - IEEE Access, 2024 - ieeexplore.ieee.org
This paper presents an experimental study regarding the use of OpenAI's ChatGPT for
robotics applications. We outline a strategy that combines design principles for prompt …

Open problems and fundamental limitations of reinforcement learning from human feedback

S Casper, X Davies, C Shi, TK Gilbert… - arXiv preprint arXiv …, 2023 - arxiv.org
Reinforcement learning from human feedback (RLHF) is a technique for training AI systems
to align with human goals. RLHF has emerged as the central method used to finetune state …

Language to rewards for robotic skill synthesis

W Yu, N Gileadi, C Fu, S Kirmani, KH Lee… - arXiv preprint arXiv …, 2023 - arxiv.org
Large language models (LLMs) have demonstrated exciting progress in acquiring diverse
new capabilities through in-context learning, ranging from logical reasoning to code-writing …

Language models as zero-shot planners: Extracting actionable knowledge for embodied agents

W Huang, P Abbeel, D Pathak… - … conference on machine …, 2022 - proceedings.mlr.press
Can world knowledge learned by large language models (LLMs) be used to act in
interactive environments? In this paper, we investigate the possibility of grounding high-level …

On the opportunities and risks of foundation models

R Bommasani, DA Hudson, E Adeli, R Altman… - arXiv preprint arXiv …, 2021 - arxiv.org
AI is undergoing a paradigm shift with the rise of models (e.g., BERT, DALL-E, GPT-3) that are
trained on broad data at scale and are adaptable to a wide range of downstream tasks. We …

Vision-and-language navigation: A survey of tasks, methods, and future directions

J Gu, E Stefani, Q Wu, J Thomason… - arXiv preprint arXiv …, 2022 - arxiv.org
A long-term goal of AI research is to build intelligent agents that can communicate with
humans in natural language, perceive the environment, and perform real-world tasks. Vision …

Learning language-conditioned robot behavior from offline data and crowd-sourced annotation

S Nair, E Mitchell, K Chen… - Conference on Robot …, 2022 - proceedings.mlr.press
We study the problem of learning a range of vision-based manipulation tasks from a large
offline dataset of robot interaction. In order to accomplish this, humans need easy and …

BEHAVIOR: Benchmark for everyday household activities in virtual, interactive, and ecological environments

S Srivastava, C Li, M Lingelbach… - … on robot learning, 2022 - proceedings.mlr.press
We introduce BEHAVIOR, a benchmark for embodied AI with 100 activities in simulation,
spanning a range of everyday household chores such as cleaning, maintenance, and food …

Search on the replay buffer: Bridging planning and reinforcement learning

B Eysenbach, RR Salakhutdinov… - Advances in neural …, 2019 - proceedings.neurips.cc
The history of learning for control has been an exciting back and forth between two broad
classes of algorithms: planning and reinforcement learning. Planning algorithms effectively …