Recent advances in reinforcement learning in finance
The rapid changes in the finance industry due to the increasing amount of data have
revolutionized the techniques on data processing and data analysis and brought new …
[BOOK][B] Control systems and reinforcement learning
S Meyn - 2022 - books.google.com
A high school student can create deep Q-learning code to control her robot, without any
understanding of the meaning of 'deep' or 'Q', or why the code sometimes fails. This book is …
On the theory of policy gradient methods: Optimality, approximation, and distribution shift
Policy gradient methods are among the most effective methods in challenging reinforcement
learning problems with large state and/or action spaces. However, little is known about even …
Optimality and approximation with policy gradient methods in Markov decision processes
Policy gradient (PG) methods are among the most effective methods in challenging
reinforcement learning problems with large state and/or action spaces. However, little is …
Natural policy gradient primal-dual method for constrained Markov decision processes
We study sequential decision-making problems in which each agent aims to maximize the
expected total reward while satisfying a constraint on the expected total utility. We employ …
On the global convergence rates of softmax policy gradient methods
J Mei, C Xiao, C Szepesvari… - … on machine learning, 2020 - proceedings.mlr.press
We make three contributions toward better understanding policy gradient methods in the
tabular setting. First, we show that with the true gradient, policy gradient with a softmax …
Fast global convergence of natural policy gradient methods with entropy regularization
Natural policy gradient (NPG) methods are among the most widely used policy optimization
algorithms in contemporary reinforcement learning. This class of methods is often applied in …
Provably efficient exploration in policy optimization
While policy-based reinforcement learning (RL) achieves tremendous successes in practice,
it is significantly less understood in theory, especially compared with value-based RL. In …
Independent policy gradient methods for competitive reinforcement learning
C Daskalakis, DJ Foster… - Advances in neural …, 2020 - proceedings.neurips.cc
We obtain global, non-asymptotic convergence guarantees for independent learning
algorithms in competitive reinforcement learning settings with two agents (i.e., zero-sum …
On the convergence rates of policy gradient methods
L Xiao - Journal of Machine Learning Research, 2022 - jmlr.org
We consider infinite-horizon discounted Markov decision problems with finite state and
action spaces and study the convergence rates of the projected policy gradient method and …