- Academic resource search

A review of robot learning for manipulation: Challenges, representations, and algorithms

O Kroemer, S Niekum, G Konidaris - Journal of machine learning research, 2021 - jmlr.org

A key challenge in intelligent robotics is creating robots that are capable of directly
interacting with the world around them to achieve their goals. The last decade has seen …

Cited by 426 | Related articles | All 18 versions

[PDF] annualreviews.org

Embodied communication: How robots and people communicate through physical interaction

A Kalinowska, PM Pilarski… - Annual review of control …, 2023 - annualreviews.org

Early research on physical human–robot interaction (pHRI) has necessarily focused on
device design—the creation of compliant and sensorized hardware, such as exoskeletons …

Cited by 33 | Related articles | All 4 versions

[PDF] neurips.cc

Defining and characterizing reward gaming

J Skalse, N Howe… - Advances in Neural …, 2022 - proceedings.neurips.cc

We provide the first formal definition of reward hacking, a phenomenon where
optimizing an imperfect proxy reward function, $\mathcal{\tilde{R}}$, leads to poor …

Cited by 222 | Related articles | All 7 versions

[PDF] arxiv.org

Robots that ask for help: Uncertainty alignment for large language model planners

AZ Ren, A Dixit, A Bodrova, S Singh, S Tu… - arXiv preprint arXiv …, 2023 - arxiv.org

Large language models (LLMs) exhibit a wide range of promising capabilities--from
step-by-step planning to commonsense reasoning--that may provide utility for robots, but remain …

Cited by 175 | Related articles | All 7 versions

[PDF] mlr.press

Few-shot preference learning for human-in-the-loop RL

DJ Hejna III, D Sadigh - Conference on Robot Learning, 2023 - proceedings.mlr.press

While reinforcement learning (RL) has become a more popular approach for robotics,
designing sufficiently informative reward functions for complex tasks has proven to be …

Cited by 84 | Related articles | All 6 versions

[PDF] mlr.press

Discriminator-weighted offline imitation learning from suboptimal demonstrations

H Xu, X Zhan, H Yin, H Qin - International Conference on …, 2022 - proceedings.mlr.press

We study the problem of offline Imitation Learning (IL) where an agent aims to learn an
optimal expert behavior policy without additional online environment interactions. Instead …

Cited by 77 | Related articles | All 10 versions

[PDF] neurips.cc

CEIL: Generalized contextual imitation learning

J Liu, L He, Y Kang, Z Zhuang… - Advances in Neural …, 2023 - proceedings.neurips.cc

In this paper, we present ContExtual Imitation Learning (CEIL), a general and broadly
applicable algorithm for imitation learning (IL). Inspired by the formulation of hindsight …

Cited by 15 | Related articles | All 5 versions

[PDF] mlr.press

Learning from suboptimal demonstration via self-supervised reward regression

L Chen, R Paleja, M Gombolay - Conference on robot …, 2021 - proceedings.mlr.press

Learning from Demonstration (LfD) seeks to democratize robotics by enabling
non-roboticist end-users to teach robots to perform a task by providing a human demonstration …

Cited by 120 | Related articles | All 4 versions

[PDF] arxiv.org

Benchmarks and algorithms for offline preference-based reward learning

D Shin, AD Dragan, DS Brown - arXiv preprint arXiv:2301.01392, 2023 - arxiv.org

Learning a reward function from human preferences is challenging as it typically requires
having a high-fidelity simulator or using expensive and potentially unsafe actual physical …

Cited by 57 | Related articles | All 4 versions

[PDF] mlr.press

Imitation learning by estimating expertise of demonstrators

M Beliaev, A Shih, S Ermon, D Sadigh… - International …, 2022 - proceedings.mlr.press

Many existing imitation learning datasets are collected from multiple demonstrators, each
with different expertise at different parts of the environment. Yet, standard imitation learning …

Cited by 50 | Related articles | All 8 versions