Recent advances in reinforcement learning in finance
The rapid changes in the finance industry due to the increasing amount of data have
revolutionized the techniques on data processing and data analysis and brought new …
Policy gradient method for robust reinforcement learning
This paper develops the first policy gradient method with global optimality guarantee and
complexity analysis for robust reinforcement learning under model mismatch. Robust …
Online robust reinforcement learning with model uncertainty
Robust reinforcement learning (RL) is to find a policy that optimizes the worst-case
performance over an uncertainty set of MDPs. In this paper, we focus on model-free robust …
A theory of regularized Markov decision processes
M Geist, B Scherrer, O Pietquin - … Conference on Machine …, 2019 - proceedings.mlr.press
Many recent successful (deep) reinforcement learning algorithms make use of
regularization, generally based on entropy or Kullback-Leibler divergence. We propose a …
Bridging the gap between value and policy based reinforcement learning
We establish a new connection between value and policy based reinforcement learning
(RL) based on a relationship between softmax temporal value consistency and policy …
On the properties of the softmax function with application in game theory and reinforcement learning
In this paper, we utilize results from convex analysis and monotone operator theory to derive
additional properties of the softmax function that have not yet been covered in the existing …
SBEED: Convergent reinforcement learning with nonlinear function approximation
When function approximation is used, solving the Bellman optimality equation with stability
guarantees has remained a major open problem in reinforcement learning for decades. The …
Learning mean-field games
This paper presents a general mean-field game (GMFG) framework for simultaneous
learning and decision-making in stochastic games with a large population. It first establishes …
A unified view of entropy-regularized Markov decision processes
We propose a general framework for entropy-regularized average-reward reinforcement
learning in Markov decision processes (MDPs). Our approach is based on extending the …
Finite-sample analysis for SARSA with linear function approximation
SARSA is an on-policy algorithm to learn a Markov decision process policy in reinforcement
learning. We investigate the SARSA algorithm with linear function approximation under the …