Natural actor-critic for robust reinforcement learning with function approximation

R Zhou, T Liu, M Cheng, D Kalathil… - Advances in neural …, 2024 - proceedings.neurips.cc
We study robust reinforcement learning (RL) with the goal of determining a well-performing
policy that is robust against model mismatch between the training simulator and the testing …

Double pessimism is provably efficient for distributionally robust offline reinforcement learning: Generic algorithm and robust partial coverage

J Blanchet, M Lu, T Zhang… - Advances in Neural …, 2024 - proceedings.neurips.cc
We study distributionally robust offline reinforcement learning (RL), which seeks to find an
optimal robust policy purely from an offline dataset that can perform well in perturbed …

Single-trajectory distributionally robust reinforcement learning

Z Liang, X Ma, J Blanchet, J Zhang, Z Zhou - arXiv preprint arXiv …, 2023 - arxiv.org
As a framework for sequential decision-making, Reinforcement Learning (RL) has been
regarded as an essential component leading to Artificial General Intelligence (AGI) …

Model-free robust average-reward reinforcement learning

Y Wang, A Velasquez, GK Atia… - International …, 2023 - proceedings.mlr.press
Robust Markov decision processes (MDPs) address the challenge of model
uncertainty by optimizing the worst-case performance over an uncertainty set of MDPs. In …

Decentralized robust V-learning for solving Markov games with model uncertainty

S Ma, Z Chen, S Zou, Y Zhou - Journal of Machine Learning Research, 2023 - jmlr.org
The Markov game is a popular reinforcement learning framework for modeling competitive
players in a dynamic environment. However, most of the existing works on Markov games …

Robust SGLD algorithm for solving non-convex distributionally robust optimisation problems

A Neufeld, MNC En, Y Zhang - arXiv preprint arXiv:2403.09532, 2024 - arxiv.org
In this paper we develop a Stochastic Gradient Langevin Dynamics (SGLD) algorithm
tailored for solving a certain class of non-convex distributionally robust optimisation …

Sequential Decision-Making under Uncertainty: A Robust MDPs review

W Ou, S Bi - arXiv preprint arXiv:2404.00940, 2024 - arxiv.org
This review paper provides an in-depth overview of the evolution and advancements in
Robust Markov Decision Processes (RMDPs), a field of paramount importance for its role in …

Robust option pricing with volatility term structure -- An empirical study for variance options

AMG Cox, AM Grass - arXiv preprint arXiv:2312.09201, 2023 - arxiv.org
The robust option pricing problem is to find upper and lower bounds on fair prices of
financial claims using only the most minimal assumptions. It contrasts with the classical …

On Practical Robust Reinforcement Learning: Adjacent Uncertainty Set and Double-Agent Algorithm

U Hwang, S Hong - IEEE Transactions on Neural Networks and …, 2024 - ieeexplore.ieee.org
Robust reinforcement learning (RRL) aims to seek a robust policy by optimizing the worst
case performance over an uncertainty set. This set contains some perturbed Markov …

Regularized Q-learning through Robust Averaging

P Schmitt-Förster, T Sutter - arXiv preprint arXiv:2405.02201, 2024 - arxiv.org
We propose a new Q-learning variant, called 2RA Q-learning, that addresses some
weaknesses of existing Q-learning methods in a principled manner. One such weakness is …