Robust Q-learning algorithm for Markov decision processes under Wasserstein uncertainty
We present a novel Q-learning algorithm tailored to solve distributionally robust Markov
decision problems where the corresponding ambiguity set of transition probabilities for the …
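Since the abstract is truncated before the algorithm is stated, here is a minimal sketch of the generic robust Q-learning idea it alludes to: back up against the worst-case transition kernel in an ambiguity set. The finite candidate-kernel set is a crude stand-in for the paper's Wasserstein ball, and all names (`robust_q_update`, `kernels`) are illustrative assumptions, not the paper's method.

```python
import numpy as np

def robust_q_update(Q, s, a, kernels, reward, gamma, lr):
    """One robust Q-learning step on a finite MDP: back up against the
    worst-case expected next-state value over a finite set of candidate
    transition kernels (an approximation of a Wasserstein ambiguity ball).
    Each kernel P has shape (n_states, n_actions, n_states)."""
    v = Q.max(axis=1)                          # greedy state values
    worst = min(P[s, a] @ v for P in kernels)  # adversarial next-state value
    target = reward + gamma * worst
    Q[s, a] += lr * (target - Q[s, a])
    return Q

# toy example: two states, two actions, two candidate kernels
P1 = np.full((2, 2, 2), 0.5)                 # uniform transitions
P2 = np.zeros((2, 2, 2)); P2[:, :, 0] = 1.0  # always jump to state 0
Q = np.zeros((2, 2))
Q = robust_q_update(Q, s=0, a=1, kernels=[P1, P2],
                    reward=1.0, gamma=0.9, lr=0.5)
```

With `Q` initialized to zero, the adversarial continuation value is zero, so the first update moves `Q[0, 1]` halfway toward the immediate reward.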
Robust SGLD algorithm for solving non-convex distributionally robust optimisation problems
In this paper we develop a Stochastic Gradient Langevin Dynamics (SGLD) algorithm
tailored for solving a certain class of non-convex distributionally robust optimisation …
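For context on the SGLD family this entry builds on, the standard (non-robust) Stochastic Gradient Langevin Dynamics iteration is a gradient step plus Gaussian noise scaled by an inverse temperature. The sketch below shows that baseline update only; the paper's robust variant is not reproduced here, and the function names are illustrative.

```python
import numpy as np

def sgld_step(theta, grad, step_size, inv_temp, rng):
    """One SGLD update:
    theta <- theta - eta * grad + sqrt(2 * eta / beta) * xi,  xi ~ N(0, I),
    where eta is the step size and beta the inverse temperature."""
    noise = rng.standard_normal(theta.shape)
    return theta - step_size * grad + np.sqrt(2.0 * step_size / inv_temp) * noise

# toy run: minimize f(theta) = ||theta||^2 / 2, whose gradient is theta
rng = np.random.default_rng(0)
theta = np.ones(3)
for _ in range(2000):
    theta = sgld_step(theta, theta, step_size=0.01, inv_temp=1e4, rng=rng)
```

At large inverse temperature the noise is small and the iterates concentrate near the minimizer at the origin; at small inverse temperature the added noise helps escape poor local minima, which is why SGLD suits non-convex objectives.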
Beyond discounted returns: Robust Markov decision processes with average and Blackwell optimality
J Grand-Clement, M Petrik, N Vieille - arXiv preprint arXiv:2312.03618, 2023 - arxiv.org
Robust Markov Decision Processes (RMDPs) are a widely used framework for sequential
decision-making under parameter uncertainty. RMDPs have been extensively studied when …
Policy Gradient for Robust Markov Decision Processes
We develop a generic policy gradient method with the global optimality guarantee for robust
Markov Decision Processes (MDPs). While policy gradient methods are widely used for …
Time-Constrained Robust MDPs
Robust reinforcement learning is essential for deploying reinforcement learning algorithms
in real-world scenarios where environmental uncertainty predominates. Traditional robust …
Bootstrapping Expectiles in Reinforcement Learning
Many classic Reinforcement Learning (RL) algorithms rely on a Bellman operator, which
involves an expectation over the next states, leading to the concept of bootstrapping. To …
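To make the expectile idea concrete: the tau-expectile generalizes the mean by minimizing an asymmetric squared loss, weighting positive errors by tau and negative errors by 1 - tau, so it can replace the expectation inside a Bellman backup. This is a minimal self-contained sketch, assuming a simple gradient-descent solver; the helper names are illustrative, not from the paper.

```python
import numpy as np

def expectile_loss(deltas, tau):
    """Asymmetric squared loss whose minimizer is the tau-expectile:
    weight tau on nonnegative errors, (1 - tau) on negative ones."""
    w = np.where(deltas >= 0, tau, 1.0 - tau)
    return np.mean(w * deltas ** 2)

def expectile(x, tau, iters=100, lr=0.5):
    """Compute the tau-expectile of samples x by gradient descent on the
    asymmetric loss, starting from the sample mean."""
    m = float(np.mean(x))
    for _ in range(iters):
        d = x - m
        grad = -2.0 * np.mean(np.where(d >= 0, tau, 1.0 - tau) * d)
        m -= lr * grad
    return m

x = np.array([0.0, 1.0, 2.0, 3.0])
mid = expectile(x, 0.5)   # recovers the mean, 1.5
high = expectile(x, 0.9)  # optimistic statistic above the mean
```

At tau = 0.5 the loss is the ordinary squared error, so the expectile reduces to the mean; tau above (below) 0.5 gives an optimistic (pessimistic) value, which is the knob such algorithms use in place of the plain expectation.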
Bounding the Difference between the Values of Robust and Non-Robust Markov Decision Problems
In this note we provide an upper bound for the difference between the value function of a
distributionally robust Markov decision problem and the value function of a non-robust …
Accelerated Policy Gradient for s-rectangular Robust MDPs with Large State Spaces
The robust Markov decision process (robust MDP) is an important machine learning framework
to make a reliable policy that is robust to environmental perturbation. Despite empirical …
Soft Robust MDPs and Risk-Sensitive MDPs: Equivalence, Policy Gradient, and Sample Complexity
R Zhang, Y Hu, N Li - The Twelfth International Conference on Learning … - openreview.net
Robust Markov Decision Processes (MDPs) and risk-sensitive MDPs are both powerful tools
for making decisions in the presence of uncertainties. Previous efforts have aimed to …
Robust Reinforcement Learning with General Utility
Z Chen, Y Wen, Z Hu, H Huang - The Thirty-eighth Annual Conference on … - openreview.net
The Reinforcement Learning (RL) problem with general utility is a powerful decision-making
framework that covers standard RL with cumulative cost, exploration problems, and …