A Review of Safe Reinforcement Learning: Methods, Theories and Applications

S Gu, L Yang, Y Du, G Chen, F Walter… - … on Pattern Analysis …, 2024 - ieeexplore.ieee.org
Reinforcement Learning (RL) has achieved tremendous success in many complex decision-
making tasks. However, safety concerns are raised during deploying RL in real-world …

A comprehensive survey on safe reinforcement learning

J García, F Fernández - Journal of Machine Learning Research, 2015 - jmlr.org
Abstract Safe Reinforcement Learning can be defined as the process of learning policies
that maximize the expectation of the return in problems in which it is important to ensure …

Nonequilibrium Markov processes conditioned on large deviations

R Chetrite, H Touchette - Annales Henri Poincaré, 2015 - Springer
We consider the problem of conditioning a Markov process on a rare event and of
representing this conditioned process by a conditioning-free process, called the effective or …

Ergodic risk-sensitive control—a survey

A Biswas, VS Borkar - Annual Reviews in Control, 2023 - Elsevier
Risk-sensitive control has received considerable interest since the seminal work of Howard
and Matheson (Howard and Matheson, 1971/72) because of its ability to account for …

A reinforcement learning approach to rare trajectory sampling

DC Rose, JF Mair, JP Garrahan - New Journal of Physics, 2021 - iopscience.iop.org
Very often when studying non-equilibrium systems one is interested in analysing dynamical
behaviour that occurs with very low probability, so called rare events. In practice, since rare …

Variational and optimal control representations of conditioned and driven processes

R Chetrite, H Touchette - Journal of Statistical Mechanics: Theory …, 2015 - iopscience.iop.org
We have shown recently that a Markov process conditioned on rare events involving time-
integrated random variables can be described in the long-time limit by an effective Markov …

Density constrained reinforcement learning

Z Qin, Y Chen, C Fan - International conference on machine …, 2021 - proceedings.mlr.press
We study constrained reinforcement learning (CRL) from a novel perspective by setting
constraints directly on state density functions, rather than the value functions considered by …

Adaptive sampling of large deviations

G Ferré, H Touchette - Journal of Statistical Physics, 2018 - Springer
We introduce and test an algorithm that adaptively estimates large deviation functions
characterizing the fluctuations of additive functionals of Markov processes in the long-time …

Variance-constrained actor-critic algorithms for discounted and average reward MDPs

LA Prashanth, M Ghavamzadeh - Machine Learning, 2016 - Springer
In many sequential decision-making problems we may want to manage risk by minimizing
some measure of variability in rewards in addition to maximizing a standard criterion …

CVaR-Constrained Policy Optimization for Safe Reinforcement Learning

Q Zhang, S Leng, X Ma, Q Liu, X Wang… - … on Neural Networks …, 2024 - ieeexplore.ieee.org
Current constrained reinforcement learning (RL) methods guarantee constraint satisfaction
only in expectation, which is inadequate for safety-critical decision problems. Since a …