Evolved policy gradients

R Houthooft, Y Chen, P Isola, B Stadie… - Advances in …, 2018 - proceedings.neurips.cc
… which optimizes its policy to minimize … evolved policy gradient algorithm (EPG) achieves
faster learning on several randomized environments compared to an off-the-shelf policy gradient

Learning adaptive differential evolution algorithm from optimization experiences by policy gradient

J Sun, X Liu, T Bäck, Z Xu - IEEE Transactions on Evolutionary …, 2021 - ieeexplore.ieee.org
… A reinforcement learning algorithm, named policy gradient, is applied to learn an agent (i.e., …
differential evolution adaptively during the search procedure. The differential evolution

Policy-based optimization: single-step policy gradient method seen as an evolution strategy

J Viquerat, R Duvigneau, P Meliga, A Kuhnle… - Neural Computing and …, 2023 - Springer
This research reports on the recent development of black-box optimization methods based
on single-step deep reinforcement learning and their conceptual similarity to evolution

Policy gradient assisted MAP-Elites

O Nilsson, A Cully - Proceedings of the Genetic and Evolutionary …, 2021 - dl.acm.org
Policy Gradient Assisted MAP-Elites (PGA-MAP-Elites) is an extension of MAP-Elites that …
evolving DNN controllers by combining the search power and data-efficiency of Policy Gradient

CEM-RL: Combining evolutionary and gradient-based methods for policy search

A Pourchot, O Sigaud - arXiv preprint arXiv:1810.01222, 2018 - arxiv.org
Policy Gradient (DDPG) algorithm, a sample efficient off-policy … Deterministic policy gradient
(TD3), another off-policy deep RL … mixing mechanism over the evolution of performance, for …

Understanding features on evolutionary policy optimizations: Feature learning difference between gradient-based and evolutionary policy optimizations

S Lee, MH Ha, B Moon - Proceedings of the 35th Annual ACM …, 2020 - dl.acm.org
… using gradient descent to find the optimal agent. One of the basic policy gradient algorithms
… ES for comparison between the gradient-based method and the evolutionary algorithm. The …

NEAT for large-scale reinforcement learning through evolutionary feature learning and policy gradient search

Y Peng, G Chen, H Singh, M Zhang - Proceedings of the Genetic and …, 2018 - dl.acm.org
… exact 2,000,000 samples be used for evolving good feature networks. Lastly, we will
consume another 7,990,000 samples to train policy network in the policy gradient search stage. …

Policy gradient methods for robotics

J Peters, S Schaal - 2006 IEEE/RSJ international conference …, 2006 - ieeexplore.ieee.org
Policy gradient methods remain one of the few exceptions and have found a variety of
applications. Nevertheless, the application of such methods is not without peril if done in an …

Policy gradient methods for reinforcement learning with function approximation

RS Sutton, D McAllester, S Singh… - Advances in neural …, 1999 - proceedings.neurips.cc
gradient of the policy parameterization. Because the expression above is zero, we can subtract
it from the policy gradient … proceed without affecting the expected evolution of f_w and π. …

A policy gradient algorithm for learning to learn in multiagent reinforcement learning

DK Kim, M Liu, MD Riemer, C Sun… - International …, 2021 - proceedings.mlr.press
… The key idea is to model the meta-agent’s own learning process so that its updated policy
performs better than an evolving opponent. However, prior work does not directly consider the …