Plas: Latent action space for offline reinforcement learning

Í Elguea-Aguinaco, A Serrano-Muñoz… - Robotics and Computer …, 2023 - Elsevier

Research and application of reinforcement learning in robotics for contact-rich manipulation
tasks have exploded in recent years. Its ability to cope with unstructured environments and …

Cited by 60 Related articles All 6 versions

[PDF] neurips.cc

Model-based imitation learning for urban driving

A Hu, G Corrado, N Griffiths, Z Murez… - Advances in …, 2022 - proceedings.neurips.cc

An accurate model of the environment and the dynamic agents acting in it offers great
potential for improving motion planning. We present MILE: a Model-based Imitation …

Cited by 98 Related articles All 11 versions

[PDF] neurips.cc

Combo: Conservative offline model-based policy optimization

T Yu, A Kumar, R Rafailov… - Advances in neural …, 2021 - proceedings.neurips.cc

Model-based reinforcement learning (RL) algorithms, which learn a dynamics
model from logged experience and perform conservative planning under the learned model …

Cited by 370 Related articles All 7 versions

[PDF] neurips.cc

Mildly conservative q-learning for offline reinforcement learning

J Lyu, X Ma, X Li, Z Lu - Advances in Neural Information …, 2022 - proceedings.neurips.cc

Offline reinforcement learning (RL) defines the task of learning from a static logged dataset
without continually interacting with the environment. The distribution shift between the …

Cited by 82 Related articles All 5 versions

[PDF] jmlr.org

d3rlpy: An offline deep reinforcement learning library

T Seno, M Imai - Journal of Machine Learning Research, 2022 - jmlr.org

In this paper, we introduce d3rlpy, an open-sourced offline deep reinforcement learning (RL)
library for Python. d3rlpy supports a set of offline deep RL algorithms as well as off-policy …

Cited by 143 Related articles All 6 versions

[PDF] arxiv.org

Pessimistic bootstrapping for uncertainty-driven offline reinforcement learning

C Bai, L Wang, Z Yang, Z Deng, A Garg, P Liu… - arXiv preprint arXiv …, 2022 - arxiv.org

Offline Reinforcement Learning (RL) aims to learn policies from previously collected
datasets without exploring the environment. Directly applying off-policy algorithms to offline …

Cited by 125 Related articles All 5 versions

[PDF] arxiv.org

Offline reinforcement learning via high-fidelity generative behavior modeling

H Chen, C Lu, C Ying, H Su, J Zhu - arXiv preprint arXiv:2209.14548, 2022 - arxiv.org

In offline reinforcement learning, weighted regression is a common method to ensure the
learned policy stays close to the behavior policy and to prevent selecting out-of-sample …

Cited by 69 Related articles All 3 versions

[PDF] neurips.cc

Accelerating robotic reinforcement learning via parameterized action primitives

M Dalal, D Pathak… - Advances in Neural …, 2021 - proceedings.neurips.cc

Despite the potential of reinforcement learning (RL) for building general-purpose robotic
systems, training RL agents to solve robotics tasks still remains challenging due to the …

Cited by 89 Related articles All 8 versions

[PDF] mlr.press

How to leverage unlabeled data in offline reinforcement learning

T Yu, A Kumar, Y Chebotar… - International …, 2022 - proceedings.mlr.press

Offline reinforcement learning (RL) can learn control policies from static datasets but, like
standard RL methods, it requires reward annotations for every transition. In many cases …

Cited by 62 Related articles All 5 versions

[PDF] neurips.cc

A policy-guided imitation approach for offline reinforcement learning

H Xu, L Jiang, L Jianxiong… - Advances in Neural …, 2022 - proceedings.neurips.cc

Offline reinforcement learning (RL) methods can generally be categorized into two types:
RL-based and imitation-based. RL-based methods could in principle enjoy out-of-distribution …

Cited by 42 Related articles All 7 versions