A learning algorithm for risk-sensitive cost

S Gu, L Yang, Y Du, G Chen, F Walter… - … on Pattern Analysis …, 2024 - ieeexplore.ieee.org

Reinforcement Learning (RL) has achieved tremendous success in many complex decision-
making tasks. However, safety concerns are raised during deploying RL in real-world …

Cited by 296 Related articles All 2 versions

A comprehensive survey on safe reinforcement learning

J García, F Fernández - Journal of Machine Learning Research, 2015 - jmlr.org

Abstract Safe Reinforcement Learning can be defined as the process of learning policies
that maximize the expectation of the return in problems in which it is important to ensure …

Cited by 2071 Related articles All 5 versions

Nonequilibrium Markov processes conditioned on large deviations

R Chetrite, H Touchette - Annales Henri Poincaré, 2015 - Springer

We consider the problem of conditioning a Markov process on a rare event and of
representing this conditioned process by a conditioning-free process, called the effective or …

Cited by 364 Related articles All 12 versions

Ergodic risk-sensitive control—a survey

A Biswas, VS Borkar - Annual Reviews in Control, 2023 - Elsevier

Risk-sensitive control has received considerable interest since the seminal work of Howard
and Matheson (Howard and Matheson, 1971/72) because of its ability to account for …

Cited by 17 Related articles All 4 versions

A reinforcement learning approach to rare trajectory sampling

DC Rose, JF Mair, JP Garrahan - New Journal of Physics, 2021 - iopscience.iop.org

Very often when studying non-equilibrium systems one is interested in analysing dynamical
behaviour that occurs with very low probability, so-called rare events. In practice, since rare …

Cited by 66 Related articles All 6 versions

Variational and optimal control representations of conditioned and driven processes

R Chetrite, H Touchette - Journal of Statistical Mechanics: Theory …, 2015 - iopscience.iop.org

We have shown recently that a Markov process conditioned on rare events involving time-
integrated random variables can be described in the long-time limit by an effective Markov …

Cited by 137 Related articles All 8 versions

Density constrained reinforcement learning

Z Qin, Y Chen, C Fan - International conference on machine …, 2021 - proceedings.mlr.press

We study constrained reinforcement learning (CRL) from a novel perspective by setting
constraints directly on state density functions, rather than the value functions considered by …

Cited by 36 Related articles All 8 versions

Adaptive sampling of large deviations

G Ferré, H Touchette - Journal of Statistical Physics, 2018 - Springer

We introduce and test an algorithm that adaptively estimates large deviation functions
characterizing the fluctuations of additive functionals of Markov processes in the long-time …

Cited by 64 Related articles All 12 versions

Variance-constrained actor-critic algorithms for discounted and average reward MDPs

LA Prashanth, M Ghavamzadeh - Machine Learning, 2016 - Springer

In many sequential decision-making problems we may want to manage risk by minimizing
some measure of variability in rewards in addition to maximizing a standard criterion …

Cited by 88 Related articles All 8 versions

CVaR-Constrained Policy Optimization for Safe Reinforcement Learning

Q Zhang, S Leng, X Ma, Q Liu, X Wang… - … on Neural Networks …, 2024 - ieeexplore.ieee.org

Current constrained reinforcement learning (RL) methods guarantee constraint satisfaction
only in expectation, which is inadequate for safety-critical decision problems. Since a …

Cited by 9 Related articles All 3 versions