Natural policy gradient primal-dual method for constrained Markov decision processes

S Gu, L Yang, Y Du, G Chen, F Walter, J Wang… - arXiv preprint arXiv …, 2022 - arxiv.org

Reinforcement learning (RL) has achieved tremendous success in many complex decision
making tasks. When it comes to deploying RL in the real world, safety concerns are usually …

Cited by 201 · Related articles · All 2 versions

Constrained variational policy optimization for safe reinforcement learning

Z Liu, Z Cen, V Isenbaev, W Liu, S Wu… - International …, 2022 - proceedings.mlr.press

Safe reinforcement learning (RL) aims to learn policies that satisfy certain constraints before
deploying them to safety-critical applications. Previous primal-dual style approaches suffer …

Cited by 69 · Related articles · All 6 versions

CRPO: A new approach for safe reinforcement learning with convergence guarantee

T Xu, Y Liang, G Lan - International Conference on Machine …, 2021 - proceedings.mlr.press

In safe reinforcement learning (SRL) problems, an agent explores the environment to
maximize an expected total reward and meanwhile avoids violation of certain constraints on …

Cited by 123 · Related articles · All 7 versions

Provably efficient safe exploration via primal-dual policy optimization

D Ding, X Wei, Z Yang, Z Wang… - … conference on artificial …, 2021 - proceedings.mlr.press

We study the safe reinforcement learning problem using the constrained Markov decision
processes in which an agent aims to maximize the expected total reward subject to a safety …

Cited by 160 · Related articles · All 9 versions

Sauté RL: Almost surely safe reinforcement learning using state augmentation

A Sootla, AI Cowen-Rivers, T Jafferjee… - International …, 2022 - proceedings.mlr.press

Satisfying safety constraints almost surely (or with probability one) can be critical for the
deployment of Reinforcement Learning (RL) in real-life applications. For example, plane …

Cited by 51 · Related articles · All 7 versions

Long-term fairness with unknown dynamics

T Yin, R Raab, M Liu, Y Liu - Advances in Neural …, 2024 - proceedings.neurips.cc

While machine learning can myopically reinforce social inequalities, it may also be used to
dynamically seek equitable outcomes. In this paper, we formalize long-term fairness as an …

Cited by 18 · Related articles · All 8 versions

Penalized proximal policy optimization for safe reinforcement learning

L Zhang, L Shen, L Yang, S Chen, B Yuan… - arXiv preprint arXiv …, 2022 - arxiv.org

Safe reinforcement learning aims to learn the optimal policy while satisfying safety
constraints, which is essential in real-world applications. However, current algorithms still …

Cited by 51 · Related articles · All 4 versions

Achieving zero constraint violation for constrained reinforcement learning via primal-dual approach

Q Bai, AS Bedi, M Agarwal, A Koppel… - Proceedings of the AAAI …, 2022 - ojs.aaai.org

Reinforcement learning is widely used in applications where one needs to perform
sequential decisions while interacting with the environment. The problem becomes more …

Cited by 61 · Related articles · All 6 versions

Model-free safe reinforcement learning through neural barrier certificate

Y Yang, Y Jiang, Y Liu, J Chen… - IEEE Robotics and …, 2023 - ieeexplore.ieee.org

Safety is a critical concern when applying reinforcement learning (RL) to real-world control
tasks. However, existing safe RL works either only consider expected safety constraint …

Cited by 28 · Related articles · All 2 versions

Safe policies for reinforcement learning via primal-dual methods

S Paternain, M Calvo-Fullana… - … on Automatic Control, 2022 - ieeexplore.ieee.org

In this article, we study the design of controllers in the context of stochastic optimal control
under the assumption that the model of the system is not available. This is, we aim to control …

Cited by 107 · Related articles · All 4 versions