A review of safe reinforcement learning: Methods, theory and applications

S Gu, L Yang, Y Du, G Chen, F Walter, J Wang… - arXiv preprint arXiv …, 2022 - arxiv.org
Reinforcement learning (RL) has achieved tremendous success in many complex decision-making
tasks. When it comes to deploying RL in the real world, safety concerns are usually …

Constrained variational policy optimization for safe reinforcement learning

Z Liu, Z Cen, V Isenbaev, W Liu, S Wu… - International …, 2022 - proceedings.mlr.press
Safe reinforcement learning (RL) aims to learn policies that satisfy certain constraints before
deploying them to safety-critical applications. Previous primal-dual style approaches suffer …

CRPO: A new approach for safe reinforcement learning with convergence guarantee

T Xu, Y Liang, G Lan - International Conference on Machine …, 2021 - proceedings.mlr.press
In safe reinforcement learning (SRL) problems, an agent explores the environment to
maximize an expected total reward and meanwhile avoids violation of certain constraints on …

Provably efficient safe exploration via primal-dual policy optimization

D Ding, X Wei, Z Yang, Z Wang… - … conference on artificial …, 2021 - proceedings.mlr.press
We study the safe reinforcement learning problem using the constrained Markov decision
processes in which an agent aims to maximize the expected total reward subject to a safety …

Sauté RL: Almost surely safe reinforcement learning using state augmentation

A Sootla, AI Cowen-Rivers, T Jafferjee… - International …, 2022 - proceedings.mlr.press
Satisfying safety constraints almost surely (or with probability one) can be critical for the
deployment of Reinforcement Learning (RL) in real-life applications. For example, plane …

Long-term fairness with unknown dynamics

T Yin, R Raab, M Liu, Y Liu - Advances in Neural …, 2024 - proceedings.neurips.cc
While machine learning can myopically reinforce social inequalities, it may also be used to
dynamically seek equitable outcomes. In this paper, we formalize long-term fairness as an …

Penalized proximal policy optimization for safe reinforcement learning

L Zhang, L Shen, L Yang, S Chen, B Yuan… - arXiv preprint arXiv …, 2022 - arxiv.org
Safe reinforcement learning aims to learn the optimal policy while satisfying safety
constraints, which is essential in real-world applications. However, current algorithms still …

Achieving zero constraint violation for constrained reinforcement learning via primal-dual approach

Q Bai, AS Bedi, M Agarwal, A Koppel… - Proceedings of the AAAI …, 2022 - ojs.aaai.org
Reinforcement learning is widely used in applications where one needs to perform
sequential decisions while interacting with the environment. The problem becomes more …

Model-free safe reinforcement learning through neural barrier certificate

Y Yang, Y Jiang, Y Liu, J Chen… - IEEE Robotics and …, 2023 - ieeexplore.ieee.org
Safety is a critical concern when applying reinforcement learning (RL) to real-world control
tasks. However, existing safe RL works either only consider expected safety constraint …

Safe policies for reinforcement learning via primal-dual methods

S Paternain, M Calvo-Fullana… - … on Automatic Control, 2022 - ieeexplore.ieee.org
In this article, we study the design of controllers in the context of stochastic optimal control
under the assumption that the model of the system is not available. That is, we aim to control …