Recent advances in reinforcement learning in finance

B Hambly, R Xu, H Yang - Mathematical Finance, 2023 - Wiley Online Library
The rapid changes in the finance industry due to the increasing amount of data have
revolutionized the techniques on data processing and data analysis and brought new …

Hierarchical reinforcement learning: A survey and open research challenges

M Hutsebaut-Buysse, K Mets, S Latré - Machine Learning and Knowledge …, 2022 - mdpi.com
Reinforcement learning (RL) allows an agent to solve sequential decision-making problems
by interacting with an environment in a trial-and-error fashion. When these environments are …

Policy gradient and actor-critic learning in continuous time and space: Theory and algorithms

Y Jia, XY Zhou - Journal of Machine Learning Research, 2022 - jmlr.org
We study policy gradient (PG) for reinforcement learning in continuous time and space
under the regularized exploratory formulation developed by Wang et al. (2020). We …

Continuous‐time mean–variance portfolio selection: A reinforcement learning framework

H Wang, XY Zhou - Mathematical Finance, 2020 - Wiley Online Library
We approach the continuous‐time mean–variance portfolio selection with reinforcement
learning (RL). The problem is to achieve the best trade‐off between exploration and …

Policy evaluation and temporal-difference learning in continuous time and space: A martingale approach

Y Jia, XY Zhou - Journal of Machine Learning Research, 2022 - jmlr.org
We propose a unified framework to study policy evaluation (PE) and the associated temporal
difference (TD) methods for reinforcement learning in continuous time and space. We show …

Entropy regularization for mean field games with learning

X Guo, R Xu, T Zariphopoulou - Mathematics of Operations …, 2022 - pubsonline.informs.org
Entropy regularization has been extensively adopted to improve the efficiency, the stability,
and the convergence of algorithms in reinforcement learning. This paper analyzes both …

Machine learning for optical fiber communication systems: An introduction and overview

JW Nevin, S Nallaperuma, NA Shevchenko, X Li… - APL Photonics, 2021 - pubs.aip.org
Optical networks generate a vast amount of diagnostic, control, and performance monitoring
data. When information is extracted from these data, reconfigurable network elements and …

A novel exploration-exploitation-based adaptive law for intelligent model-free control approaches

O Tutsoy, DE Barkana, K Balikci - IEEE transactions on …, 2021 - ieeexplore.ieee.org
Model-free control approaches require advanced exploration-exploitation policies to
achieve practical tasks such as learning to bipedal robot walk in unstructured environments …

Efficient exploration in continuous-time model-based reinforcement learning

L Treven, J Hübotter, F Dörfler… - Advances in Neural …, 2024 - proceedings.neurips.cc
Reinforcement learning algorithms typically consider discrete-time dynamics, even though
the underlying systems are often continuous in time. In this paper, we introduce a model …

Policy optimization for continuous reinforcement learning

H Zhao, W Tang, D Yao - Advances in Neural Information …, 2024 - proceedings.neurips.cc
We study reinforcement learning (RL) in the setting of continuous time and space, for an
infinite horizon with a discounted objective and the underlying dynamics driven by a …