A review of robot learning for manipulation: Challenges, representations, and algorithms

O Kroemer, S Niekum, G Konidaris - Journal of machine learning research, 2021 - jmlr.org
A key challenge in intelligent robotics is creating robots that are capable of directly
interacting with the world around them to achieve their goals. The last decade has seen …

Embodied communication: How robots and people communicate through physical interaction

A Kalinowska, PM Pilarski… - Annual review of control …, 2023 - annualreviews.org
Early research on physical human–robot interaction (pHRI) has necessarily focused on
device design—the creation of compliant and sensorized hardware, such as exoskeletons …

Defining and characterizing reward gaming

J Skalse, N Howe… - Advances in Neural …, 2022 - proceedings.neurips.cc
We provide the first formal definition of reward hacking, a phenomenon where
optimizing an imperfect proxy reward function, $\mathcal{\tilde{R}}$, leads to poor …

Robots that ask for help: Uncertainty alignment for large language model planners

AZ Ren, A Dixit, A Bodrova, S Singh, S Tu… - arXiv preprint arXiv …, 2023 - arxiv.org
Large language models (LLMs) exhibit a wide range of promising capabilities--from step-by-
step planning to commonsense reasoning--that may provide utility for robots, but remain …

Few-shot preference learning for human-in-the-loop RL

DJ Hejna III, D Sadigh - Conference on Robot Learning, 2023 - proceedings.mlr.press
While reinforcement learning (RL) has become a more popular approach for robotics,
designing sufficiently informative reward functions for complex tasks has proven to be …

Discriminator-weighted offline imitation learning from suboptimal demonstrations

H Xu, X Zhan, H Yin, H Qin - International Conference on …, 2022 - proceedings.mlr.press
We study the problem of offline Imitation Learning (IL) where an agent aims to learn an
optimal expert behavior policy without additional online environment interactions. Instead …

CEIL: Generalized contextual imitation learning

J Liu, L He, Y Kang, Z Zhuang… - Advances in Neural …, 2023 - proceedings.neurips.cc
In this paper, we present ContExtual Imitation Learning (CEIL), a general and broadly
applicable algorithm for imitation learning (IL). Inspired by the formulation of hindsight …

Learning from suboptimal demonstration via self-supervised reward regression

L Chen, R Paleja, M Gombolay - Conference on robot …, 2021 - proceedings.mlr.press
Learning from Demonstration (LfD) seeks to democratize robotics by enabling non-
roboticist end-users to teach robots to perform a task by providing a human demonstration …

Benchmarks and algorithms for offline preference-based reward learning

D Shin, AD Dragan, DS Brown - arXiv preprint arXiv:2301.01392, 2023 - arxiv.org
Learning a reward function from human preferences is challenging as it typically requires
having a high-fidelity simulator or using expensive and potentially unsafe actual physical …

Imitation learning by estimating expertise of demonstrators

M Beliaev, A Shih, S Ermon, D Sadigh… - International …, 2022 - proceedings.mlr.press
Many existing imitation learning datasets are collected from multiple demonstrators, each
with different expertise at different parts of the environment. Yet, standard imitation learning …