Open problems and fundamental limitations of reinforcement learning from human feedback
Reinforcement learning from human feedback (RLHF) is a technique for training AI systems
to align with human goals. RLHF has emerged as the central method used to finetune state …
Cognitive architectures for language agents
Recent efforts have incorporated large language models (LLMs) with external resources (e.g.,
the Internet) or internal control flows (e.g., prompt chaining) for tasks requiring grounding or …
RoboCLIP: One demonstration is enough to learn robot policies
Reward specification is a notoriously difficult problem in reinforcement learning, requiring
extensive expert supervision to design robust reward functions. Imitation learning (IL) …
Interactive imitation learning in robotics: A survey
Published in Foundations and Trends® in Robotics.
A survey of reinforcement learning from human feedback
Reinforcement learning from human feedback (RLHF) is a variant of reinforcement learning
(RL) that learns from human feedback instead of relying on an engineered reward function …
Data quality in imitation learning
In supervised learning, the question of data quality and curation has been sidelined in
recent years in favor of increasingly more powerful and expressive models that can ingest …
Safe imitation learning via fast Bayesian reward inference from preferences
Bayesian reward learning from demonstrations enables rigorous safety and uncertainty
analysis when performing imitation learning. However, Bayesian reward learning methods …
Active preference-based Gaussian process regression for reward learning
Designing reward functions is a challenging problem in AI and robotics. Humans usually
have a difficult time directly specifying all the desirable behaviors that a robot needs to …
When humans aren't optimal: Robots that collaborate with risk-aware humans
In order to collaborate safely and efficiently, robots need to anticipate how their human
partners will behave. Some of today's robots model humans as if they were also robots, and …
Active preference-based Gaussian process regression for reward learning and optimization
Designing reward functions is a difficult task in AI and robotics. The complex task of directly
specifying all the desirable behaviors a robot needs to optimize often proves challenging for …