Compositional transfer in hierarchical reinforcement learning

M Eppe, C Gumbsch, M Kerzel, PDH Nguyen… - Nature Machine …, 2022 - nature.com

According to cognitive psychology and related disciplines, the development of complex
problem-solving behaviour in biological agents depends on hierarchical cognitive …

Cited by 81 · Related articles · All 9 versions

[PDF] science.org

Learning agile soccer skills for a bipedal robot with deep reinforcement learning

T Haarnoja, B Moran, G Lever, SH Huang… - Science Robotics, 2024 - science.org

We investigated whether deep reinforcement learning (deep RL) is able to synthesize
sophisticated and safe movement skills for a low-cost, miniature humanoid robot that can be …

Cited by 91 · Related articles · All 7 versions

[PDF] science.org

From motor control to team play in simulated humanoid football

S Liu, G Lever, Z Wang, J Merel, SMA Eslami… - Science Robotics, 2022 - science.org

Learning to combine control at the level of joint torques with longer-term goal-directed
behavior is a long-standing challenge for physically embodied artificial agents. Intelligent …

Cited by 118 · Related articles · All 6 versions

[PDF] arxiv.org

The challenges of exploration for offline reinforcement learning

N Lambert, M Wulfmeier, W Whitney, A Byravan… - arXiv preprint arXiv …, 2022 - arxiv.org

Offline Reinforcement Learning (ORL) enables us to separately study the two interlinked
processes of reinforcement learning: collecting informative experience and inferring optimal …

Cited by 38 · Related articles · All 4 versions

[PDF] neurips.cc

Is bang-bang control all you need? Solving continuous control with Bernoulli policies

T Seyde, I Gilitschenski, W Schwarting… - Advances in …, 2021 - proceedings.neurips.cc

Reinforcement learning (RL) for continuous control typically employs distributions whose
support covers the entire action space. In this work, we investigate the colloquially known …

Cited by 39 · Related articles · All 8 versions

[PDF] thecvf.com

Independent component alignment for multi-task learning

D Senushkin, N Patakin… - Proceedings of the …, 2023 - openaccess.thecvf.com

In a multi-task learning (MTL) setting, a single model is trained to tackle a diverse set of tasks
jointly. Despite rapid progress in the field, MTL remains challenging due to optimization …

Cited by 33 · Related articles · All 6 versions

[PDF] mlr.press

Data-efficient hindsight off-policy option learning

M Wulfmeier, D Rao, R Hafner… - International …, 2021 - proceedings.mlr.press

We introduce Hindsight Off-policy Options (HO2), a data-efficient option learning
algorithm. Given any trajectory, HO2 infers likely option choices and backpropagates …

Cited by 49 · Related articles · All 6 versions

[PDF] mlr.press

Measuring interpretability of neural policies of robots with disentangled representation

TH Wang, W Xiao, T Seyde… - Conference on Robot …, 2023 - proceedings.mlr.press

The advancement of robots, particularly those functioning in complex human-centric
environments, relies on control solutions that are driven by machine learning …

Cited by 3 · Related articles · All 2 versions

[PDF] mlr.press

Collect & infer - a fresh look at data-efficient reinforcement learning

M Riedmiller, JT Springenberg… - … on Robot Learning, 2022 - proceedings.mlr.press

This position paper proposes a fresh look at Reinforcement Learning (RL) from the
perspective of data-efficiency. RL has gone through three major stages: pure on-line RL …

Cited by 23 · Related articles · All 4 versions

[PDF] jmlr.org

Behavior priors for efficient reinforcement learning

D Tirumala, A Galashov, H Noh, L Hasenclever… - Journal of Machine …, 2022 - jmlr.org

As we deploy reinforcement learning agents to solve increasingly challenging problems,
methods that allow us to inject prior knowledge about the structure of the world and effective …

Cited by 36 · Related articles · All 3 versions