Recent advances in reinforcement learning in finance

B Hambly, R Xu, H Yang - Mathematical Finance, 2023 - Wiley Online Library
The rapid changes in the finance industry due to the increasing amount of data have
revolutionized the techniques on data processing and data analysis and brought new …

Dive into deep learning

A Zhang, ZC Lipton, M Li, AJ Smola - arXiv preprint arXiv:2106.11342, 2021 - arxiv.org
This open-source book represents our attempt to make deep learning approachable,
teaching readers the concepts, the context, and the code. The entire book is drafted in …

Provably efficient reinforcement learning with linear function approximation

C Jin, Z Yang, Z Wang… - Conference on learning …, 2020 - proceedings.mlr.press
Abstract Modern Reinforcement Learning (RL) is commonly applied to practical problems
with an enormous number of states, where\emph {function approximation} must be deployed …

Nearly minimax optimal reinforcement learning for linear mixture markov decision processes

D Zhou, Q Gu, C Szepesvari - Conference on Learning …, 2021 - proceedings.mlr.press
We study reinforcement learning (RL) with linear function approximation where the
underlying transition probability kernel of the Markov decision process (MDP) is a linear …

Neural thompson sampling

W Zhang, D Zhou, L Li, Q Gu - arXiv preprint arXiv:2010.00827, 2020 - arxiv.org
Thompson Sampling (TS) is one of the most effective algorithms for solving contextual multi-
armed bandit problems. In this paper, we propose a new algorithm, called Neural Thompson …

Model-based reinforcement learning with value-targeted regression

A Ayoub, Z Jia, C Szepesvari… - … on Machine Learning, 2020 - proceedings.mlr.press
This paper studies model-based reinforcement learning (RL) for regret minimization. We
focus on finite-horizon episodic RL where the transition model $ P $ belongs to a known …

Provable benefits of actor-critic methods for offline reinforcement learning

A Zanette, MJ Wainwright… - Advances in neural …, 2021 - proceedings.neurips.cc
Actor-critic methods are widely used in offline reinforcement learningpractice, but are not so
well-understood theoretically. We propose a newoffline actor-critic algorithm that naturally …

Nearly minimax optimal reinforcement learning for linear markov decision processes

J He, H Zhao, D Zhou, Q Gu - International Conference on …, 2023 - proceedings.mlr.press
We study reinforcement learning (RL) with linear function approximation. For episodic time-
inhomogeneous linear Markov decision processes (linear MDPs) whose transition …

Representation learning for online and offline rl in low-rank mdps

M Uehara, X Zhang, W Sun - arXiv preprint arXiv:2110.04652, 2021 - arxiv.org
This work studies the question of Representation Learning in RL: how can we learn a
compact low-dimensional representation such that on top of the representation we can …

Pessimistic model-based offline reinforcement learning under partial coverage

M Uehara, W Sun - arXiv preprint arXiv:2107.06226, 2021 - arxiv.org
We study model-based offline Reinforcement Learning with general function approximation
without a full coverage assumption on the offline data distribution. We present an algorithm …