Divide-and-conquer reinforcement learning

J Viquerat, P Meliga, A Larcher, E Hachem - Physics of Fluids, 2022 - pubs.aip.org

In the past couple of years, the interest of the fluid mechanics community for deep
reinforcement learning techniques has increased at fast pace, leading to a growing …

Cited by 80 - Related articles - All 10 versions

A survey on policy search algorithms for learning robot controllers in a handful of trials

K Chatzilygeroudis, V Vassiliades… - IEEE Transactions …, 2019 - ieeexplore.ieee.org

Most policy search (PS) algorithms require thousands of training episodes to find an
effective policy, which is often infeasible with a physical robot. This survey article focuses on …

Cited by 202 - Related articles - All 17 versions

Conflict-averse gradient descent for multi-task learning

B Liu, X Liu, X Jin, P Stone… - Advances in Neural …, 2021 - proceedings.neurips.cc

The goal of multi-task learning is to enable more efficient learning than single task learning
by sharing model structures for a diverse set of tasks. A standard multi-task learning …

Cited by 310 - Related articles - All 9 versions

Gradient surgery for multi-task learning

T Yu, S Kumar, A Gupta, S Levine… - Advances in Neural …, 2020 - proceedings.neurips.cc

While deep learning and deep reinforcement learning (RL) systems have demonstrated
impressive results in domains such as image classification, game playing, and robotic …

Cited by 1084 - Related articles - All 8 versions

UniDexGrasp++: Improving dexterous grasping policy learning via geometry-aware curriculum and iterative generalist-specialist learning

W Wan, H Geng, Y Liu, Z Shan… - Proceedings of the …, 2023 - openaccess.thecvf.com

We propose a novel, object-agnostic method for learning a universal policy for dexterous
object grasping from realistic point cloud observations and proprioceptive information under …

Cited by 67 - Related articles - All 5 versions

Progress & compress: A scalable framework for continual learning

J Schwarz, W Czarnecki, J Luketina… - International …, 2018 - proceedings.mlr.press

We introduce a conceptually simple and scalable framework for continual learning domains
where tasks are learned sequentially. Our method is constant in the number of parameters …

Cited by 976 - Related articles - All 4 versions

Relay policy learning: Solving long-horizon tasks via imitation and reinforcement learning

A Gupta, V Kumar, C Lynch, S Levine… - arXiv preprint arXiv …, 2019 - arxiv.org

We present relay policy learning, a method for imitation and reinforcement learning that can
solve multi-stage, long-horizon robotic tasks. This general and universally-applicable, two …

Cited by 419 - Related articles - All 6 versions

MT-Opt: Continuous multi-task robotic reinforcement learning at scale

D Kalashnikov, J Varley, Y Chebotar… - arXiv preprint arXiv …, 2021 - arxiv.org

General-purpose robotic systems must master a large repertoire of diverse skills to be useful
in a range of daily tasks. While reinforcement learning provides a powerful framework for …

Cited by 166 - Related articles - All 2 versions

Learning by playing: Solving sparse reward tasks from scratch

M Riedmiller, R Hafner, T Lampe… - International …, 2018 - proceedings.mlr.press

We propose Scheduled Auxiliary Control (SAC-X), a new learning paradigm in the
context of Reinforcement Learning (RL). SAC-X enables learning of complex behaviors-from …

Cited by 513 - Related articles - All 5 versions

Self-distillation amplifies regularization in Hilbert space

H Mobahi, M Farajtabar… - Advances in Neural …, 2020 - proceedings.neurips.cc

Knowledge distillation introduced in the deep learning context is a method to
transfer knowledge from one architecture to another. In particular, when the architectures are …

Cited by 258 - Related articles - All 10 versions