Maximum a posteriori policy optimisation

Y Matsuo, Y LeCun, M Sahani, D Precup, D Silver… - Neural Networks, 2022 - Elsevier

Deep learning (DL) and reinforcement learning (RL) methods seem to be a part of
indispensable factors to achieve human-level or super-human AI systems. On the other …

Cited by 227 · Related articles · All 7 versions

A survey on offline reinforcement learning: Taxonomy, review, and open problems

RF Prudencio, MROA Maximo… - IEEE Transactions on …, 2023 - ieeexplore.ieee.org

With the widespread adoption of deep learning, reinforcement learning (RL) has
experienced a dramatic increase in popularity, scaling to previously intractable problems …

Cited by 221 · Related articles · All 9 versions

Mastering diverse domains through world models

D Hafner, J Pasukonis, J Ba, T Lillicrap - arXiv preprint arXiv:2301.04104, 2023 - arxiv.org

Developing a general algorithm that learns to solve tasks across a wide range of
applications has been a fundamental challenge in artificial intelligence. Although current …

Cited by 318 · Related articles · All 2 versions

A generalist agent

S Reed, K Zolna, E Parisotto, SG Colmenarejo… - arXiv preprint arXiv …, 2022 - arxiv.org

Inspired by progress in large-scale language modeling, we apply a similar approach
towards building a single generalist agent beyond the realm of text outputs. The agent …

Cited by 763 · Related articles · All 4 versions

A survey of zero-shot generalisation in deep reinforcement learning

R Kirk, A Zhang, E Grefenstette, T Rocktäschel - Journal of Artificial …, 2023 - jair.org

The study of zero-shot generalisation (ZSG) in deep Reinforcement Learning (RL) aims to
produce RL algorithms whose policies generalise well to novel unseen situations at …

Cited by 312 · Related articles · All 9 versions

Learning agile soccer skills for a bipedal robot with deep reinforcement learning

T Haarnoja, B Moran, G Lever, SH Huang… - Science Robotics, 2024 - science.org

We investigated whether deep reinforcement learning (deep RL) is able to synthesize
sophisticated and safe movement skills for a low-cost, miniature humanoid robot that can be …

Cited by 56 · Related articles · All 7 versions

Sim-to-real transfer in deep reinforcement learning for robotics: a survey

W Zhao, JP Queralta… - 2020 IEEE symposium …, 2020 - ieeexplore.ieee.org

Deep reinforcement learning has recently seen huge success across multiple areas in the
robotics domain. Owing to the limitations of gathering real-world data, i.e., sample inefficiency …

Cited by 754 · Related articles · All 4 versions

Image augmentation is all you need: Regularizing deep reinforcement learning from pixels

D Yarats, I Kostrikov, R Fergus - International conference on …, 2021 - openreview.net

We propose a simple data augmentation technique that can be applied to standard model-
free reinforcement learning algorithms, enabling robust learning directly from pixels without …

Cited by 381 · Related articles · All 6 versions

AWAC: Accelerating online reinforcement learning with offline datasets

A Nair, A Gupta, M Dalal, S Levine - arXiv preprint arXiv:2006.09359, 2020 - arxiv.org

Reinforcement learning (RL) provides an appealing formalism for learning control policies
from experience. However, the classic active formulation of RL necessitates a lengthy active …

Cited by 491 · Related articles · All 7 versions

Behavior regularized offline reinforcement learning

Y Wu, G Tucker, O Nachum - arXiv preprint arXiv:1911.11361, 2019 - arxiv.org

In reinforcement learning (RL) research, it is common to assume access to direct online
interactions with the environment. However in many real-world applications, access to the …

Cited by 695 · Related articles · All 5 versions