A Review of Safe Reinforcement Learning: Methods, Theories and Applications
Reinforcement Learning (RL) has achieved tremendous success in many complex decision-
making tasks. However, safety concerns are raised during deploying RL in real-world …
A comprehensive survey on safe reinforcement learning
J García, F Fernández - Journal of Machine Learning Research, 2015 - jmlr.org
Abstract Safe Reinforcement Learning can be defined as the process of learning policies
that maximize the expectation of the return in problems in which it is important to ensure …
Nonequilibrium Markov processes conditioned on large deviations
R Chetrite, H Touchette - Annales Henri Poincaré, 2015 - Springer
We consider the problem of conditioning a Markov process on a rare event and of
representing this conditioned process by a conditioning-free process, called the effective or …
Ergodic risk-sensitive control—a survey
A Biswas, VS Borkar - Annual Reviews in Control, 2023 - Elsevier
Risk-sensitive control has received considerable interest since the seminal work of Howard
and Matheson (Howard and Matheson, 1971/72) because of its ability to account for …
A reinforcement learning approach to rare trajectory sampling
Very often when studying non-equilibrium systems one is interested in analysing dynamical
behaviour that occurs with very low probability, so-called rare events. In practice, since rare …
Variational and optimal control representations of conditioned and driven processes
R Chetrite, H Touchette - Journal of Statistical Mechanics: Theory …, 2015 - iopscience.iop.org
We have shown recently that a Markov process conditioned on rare events involving time-
integrated random variables can be described in the long-time limit by an effective Markov …
Density constrained reinforcement learning
We study constrained reinforcement learning (CRL) from a novel perspective by setting
constraints directly on state density functions, rather than the value functions considered by …
Adaptive sampling of large deviations
G Ferré, H Touchette - Journal of Statistical Physics, 2018 - Springer
We introduce and test an algorithm that adaptively estimates large deviation functions
characterizing the fluctuations of additive functionals of Markov processes in the long-time …
Variance-constrained actor-critic algorithms for discounted and average reward MDPs
LA Prashanth, M Ghavamzadeh - Machine Learning, 2016 - Springer
In many sequential decision-making problems we may want to manage risk by minimizing
some measure of variability in rewards in addition to maximizing a standard criterion …
CVaR-Constrained Policy Optimization for Safe Reinforcement Learning
Current constrained reinforcement learning (RL) methods guarantee constraint satisfaction
only in expectation, which is inadequate for safety-critical decision problems. Since a …