Evolved policy gradients - Academic Resource Search

Evolved policy gradients

R Houthooft, Y Chen, P Isola, B Stadie… - Advances in …, 2018 - proceedings.neurips.cc

… which optimizes its policy to minimize … evolved policy gradient algorithm (EPG) achieves
faster learning on several randomized environments compared to an off-the-shelf policy gradient …

Cited by 291 - Related articles - All 10 versions

[PDF] arxiv.org

Learning adaptive differential evolution algorithm from optimization experiences by policy gradient

J Sun, X Liu, T Bäck, Z Xu - IEEE Transactions on Evolutionary …, 2021 - ieeexplore.ieee.org

… A reinforcement learning algorithm, named policy gradient, is applied to learn an agent (ie, …
differential evolution adaptively during the search procedure. The differential evolution …

Cited by 101 - Related articles - All 7 versions

[PDF] arxiv.org

Policy-based optimization: single-step policy gradient method seen as an evolution strategy

J Viquerat, R Duvigneau, P Meliga, A Kuhnle… - Neural Computing and …, 2023 - Springer

This research reports on the recent development of black-box optimization methods based
on single-step deep reinforcement learning and their conceptual similarity to evolution …

Cited by 22 - Related articles - All 13 versions

[PDF] hal.science

Policy gradient assisted map-elites

O Nilsson, A Cully - Proceedings of the Genetic and Evolutionary …, 2021 - dl.acm.org

… Policy Gradient Assisted MAP-Elites (PGA-MAP-Elites) is an extension of MAP-Elites that …
evolving DNN controllers by combining the search power and data-efficiency of Policy Gradient …

Cited by 100 - Related articles - All 4 versions

[PDF] arxiv.org

CEM-RL: Combining evolutionary and gradient-based methods for policy search

A Pourchot, O Sigaud - arXiv preprint arXiv:1810.01222, 2018 - arxiv.org

… Policy Gradient (ddpg) algorithm, a sample efficient off-policy … Deterministic policy gradient
(td3), another off-policy deep RL … mixing mechanism over the evolution of performance, for …

Cited by 181 - Related articles - All 4 versions

Understanding features on evolutionary policy optimizations: Feature learning difference between gradient-based and evolutionary policy optimizations

S Lee, MH Ha, B Moon - Proceedings of the 35th Annual ACM …, 2020 - dl.acm.org

… using gradient descent to find the optimal agent. One of the basic policy gradient algorithms
… ES for comparison between the gradient-based method and the evolutionary algorithm. The …

Cited by 1 - Related articles - All 3 versions

[PDF] polytechnique.fr

NEAT for large-scale reinforcement learning through evolutionary feature learning and policy gradient search

Y Peng, G Chen, H Singh, M Zhang - Proceedings of the Genetic and …, 2018 - dl.acm.org

… exact 2,000,000 samples be used for evolving good feature networks. Lastly, we will
consume another 7,990,000 samples to train policy network in the policy gradient search stage. …

Cited by 18 - Related articles

[PDF] berkeley.edu

Policy gradient methods for robotics

J Peters, S Schaal - 2006 IEEE/RSJ international conference …, 2006 - ieeexplore.ieee.org

… Policy gradient methods remain one of the few exceptions and have found a variety of
applications. Nevertheless, the application of such methods is not without peril if done in an …

Cited by 780 - Related articles - All 19 versions

[PDF] neurips.cc

Policy gradient methods for reinforcement learning with function approximation

RS Sutton, D McAllester, S Singh… - Advances in neural …, 1999 - proceedings.neurips.cc

… gradient of the policy parameterization. Because the expression above is zero, we can subtract
it from the policy gradient … proceed without affecting the expected evolution of f_w and π. …

Cited by 9004 - Related articles - All 35 versions

[PDF] mlr.press

A policy gradient algorithm for learning to learn in multiagent reinforcement learning

DK Kim, M Liu, MD Riemer, C Sun… - International …, 2021 - proceedings.mlr.press

… The key idea is to model the meta-agent’s own learning process so that its updated policy
performs better than an evolving opponent. However, prior work does not directly consider the …

Cited by 73 - Related articles - All 9 versions