Open problems and fundamental limitations of reinforcement learning from human feedback

S Casper, X Davies, C Shi, TK Gilbert… - arXiv preprint arXiv …, 2023 - arxiv.org
Reinforcement learning from human feedback (RLHF) is a technique for training AI systems
to align with human goals. RLHF has emerged as the central method used to fine-tune state …

Cognitive architectures for language agents

TR Sumers, S Yao, K Narasimhan… - arXiv preprint arXiv …, 2023 - arxiv.org
Recent efforts have incorporated large language models (LLMs) with external resources (e.g.,
the Internet) or internal control flows (e.g., prompt chaining) for tasks requiring grounding or …

RoboCLIP: One demonstration is enough to learn robot policies

S Sontakke, J Zhang, S Arnold… - Advances in …, 2024 - proceedings.neurips.cc
Reward specification is a notoriously difficult problem in reinforcement learning, requiring
extensive expert supervision to design robust reward functions. Imitation learning (IL) …

Interactive imitation learning in robotics: A survey

C Celemin, R Pérez-Dattari, E Chisari… - … and Trends® in …, 2022 - nowpublishers.com

A survey of reinforcement learning from human feedback

T Kaufmann, P Weng, V Bengs… - arXiv preprint arXiv …, 2023 - arxiv.org
Reinforcement learning from human feedback (RLHF) is a variant of reinforcement learning
(RL) that learns from human feedback instead of relying on an engineered reward function …

Data quality in imitation learning

S Belkhale, Y Cui, D Sadigh - Advances in Neural …, 2024 - proceedings.neurips.cc
In supervised learning, the question of data quality and curation has been sidelined in
recent years in favor of increasingly more powerful and expressive models that can ingest …

Safe imitation learning via fast bayesian reward inference from preferences

D Brown, R Coleman, R Srinivasan… - … on Machine Learning, 2020 - proceedings.mlr.press
Bayesian reward learning from demonstrations enables rigorous safety and uncertainty
analysis when performing imitation learning. However, Bayesian reward learning methods …

Active preference-based gaussian process regression for reward learning

E Bıyık, N Huynh, MJ Kochenderfer… - arXiv preprint arXiv …, 2020 - arxiv.org
Designing reward functions is a challenging problem in AI and robotics. Humans usually
have a difficult time directly specifying all the desirable behaviors that a robot needs to …

When humans aren't optimal: Robots that collaborate with risk-aware humans

M Kwon, E Biyik, A Talati, K Bhasin, DP Losey… - Proceedings of the …, 2020 - dl.acm.org
In order to collaborate safely and efficiently, robots need to anticipate how their human
partners will behave. Some of today's robots model humans as if they were also robots, and …

Active preference-based Gaussian process regression for reward learning and optimization

E Bıyık, N Huynh, MJ Kochenderfer… - … Journal of Robotics …, 2024 - journals.sagepub.com
Designing reward functions is a difficult task in AI and robotics. The complex task of directly
specifying all the desirable behaviors a robot needs to optimize often proves challenging for …