Reset-free guided policy search: Efficient deep reinforcement learning with stochastic initial...

J Ibarz, J Tan, C Finn, M Kalakrishnan… - … Journal of Robotics …, 2021 - journals.sagepub.com

Deep reinforcement learning (RL) has emerged as a promising approach for autonomously
acquiring complex behaviors from low-level sensor observations. Although a large portion of …

Cited by 667 · Related articles · All 7 versions

[PDF] ieee.org

A survey on policy search algorithms for learning robot controllers in a handful of trials

K Chatzilygeroudis, V Vassiliades… - IEEE Transactions …, 2019 - ieeexplore.ieee.org

Most policy search (PS) algorithms require thousands of training episodes to find an
effective policy, which is often infeasible with a physical robot. This survey article focuses on …

Cited by 201 · Related articles · All 17 versions

[PDF] mlr.press

Solar: Deep structured representations for model-based reinforcement learning

M Zhang, S Vikram, L Smith, P Abbeel… - International …, 2019 - proceedings.mlr.press

Model-based reinforcement learning (RL) has proven to be a data efficient
approach for learning control tasks but is difficult to utilize in domains with complex …

Cited by 312 · Related articles · All 7 versions

[PDF] mlr.press

Transferring end-to-end visuomotor control from simulation to real world for a multi-stage task

S James, AJ Davison, E Johns - Conference on Robot …, 2017 - proceedings.mlr.press

End-to-end control for robot manipulation and grasping is emerging as an attractive
alternative to traditional pipelined approaches. However, end-to-end methods tend to either …

Cited by 345 · Related articles · All 8 versions

[PDF] arxiv.org

Reset-free reinforcement learning via multi-task learning: Learning dexterous manipulation behaviors without human intervention

A Gupta, J Yu, TZ Zhao, V Kumar… - … on Robotics and …, 2021 - ieeexplore.ieee.org

Reinforcement Learning (RL) algorithms can in principle acquire complex robotic skills by
learning from large amounts of data in the real world, collected via trial and error. However …

Cited by 109 · Related articles · All 4 versions

[PDF] mlr.press

Combining model-based and model-free updates for trajectory-centric reinforcement learning

Y Chebotar, K Hausman, M Zhang… - International …, 2017 - proceedings.mlr.press

Reinforcement learning algorithms for real-world robotic applications must be able to handle
complex, unknown dynamical systems while maintaining data-efficient learning. These …

Cited by 220 · Related articles · All 8 versions

[HTML] sciencedirect.com

Reset-free trial-and-error learning for robot damage recovery

K Chatzilygeroudis, V Vassiliades, JB Mouret - Robotics and Autonomous …, 2018 - Elsevier

The high probability of hardware failures prevents many advanced robots (e.g., legged
robots) from being confidently deployed in real-world situations (e.g., post-disaster rescue) …

Cited by 130 · Related articles · All 15 versions

[PDF] neurips.cc

Constrained cross-entropy method for safe reinforcement learning

M Wen, U Topcu - Advances in Neural Information …, 2018 - proceedings.neurips.cc

We study a safe reinforcement learning problem in which the constraints are defined as the
expected cost over finite-length trajectories. We propose a constrained cross-entropy-based …

Cited by 106 · Related articles · All 8 versions

[PDF] arxiv.org

Reinforcement learning for non-prehensile manipulation: Transfer from simulation to physical system

K Lowrey, S Kolev, J Dao… - … and Programming for …, 2018 - ieeexplore.ieee.org

Reinforcement learning has emerged as a promising methodology for training robot
controllers. However, most results have been limited to simulation due to the need for a …

Cited by 84 · Related articles · All 6 versions

[PDF] neurips.cc

Dual policy iteration

W Sun, GJ Gordon, B Boots… - Advances in Neural …, 2018 - proceedings.neurips.cc

Recently, a novel class of Approximate Policy Iteration (API) algorithms have demonstrated
impressive practical performance (e.g., ExIt from [1], AlphaGo-Zero from [2]). This new family …

Cited by 80 · Related articles · All 8 versions