POCE: Primal Policy Optimization with Conservative Estimation for Multi-constraint Offline Reinforcement Learning

J Guan, L Shen, A Zhou, L Li, H Hu… - Proceedings of the …, 2024 - openaccess.thecvf.com
Multi-constraint offline reinforcement learning (RL) promises to learn policies that satisfy
both cumulative and state-wise costs from offline datasets. This arrangement provides an …

GUARD: A Safe Reinforcement Learning Benchmark

W Zhao, Y Sun, F Li, R Chen, R Liu, T Wei… - arXiv preprint arXiv …, 2023 - arxiv.org
Due to the trial-and-error nature, it is typically challenging to apply RL algorithms to
safety-critical real-world applications, such as autonomous driving, human-robot interaction, robot …

The Feasibility of Constrained Reinforcement Learning Algorithms: A Tutorial Study

Y Yang, Z Zheng, SE Li, M Tomizuka, C Liu - arXiv preprint arXiv …, 2024 - arxiv.org
Satisfying safety constraints is a priority concern when solving optimal control problems
(OCPs). Due to the existence of infeasibility phenomenon, where a constraint-satisfying …

SafeRPlan: Safe deep reinforcement learning for intraoperative planning of pedicle screw placement

Y Ao, H Esfandiari, F Carrillo, CJ Laux, Y As, R Li… - Medical Image …, 2025 - Elsevier
Spinal fusion surgery requires highly accurate implantation of pedicle screw implants, which
must be conducted in critical proximity to vital structures with a limited view of the anatomy …

Learn with imagination: Safe set guided state-wise constrained policy optimization

F Li, Y Sun, W Zhao, R Chen, T Wei, C Liu - arXiv preprint arXiv …, 2023 - arxiv.org
Deep reinforcement learning (RL) excels in various control tasks, yet the absence of safety
guarantees hampers its real-world applicability. In particular, explorations during learning …

Safe Multi-Agent Reinforcement Learning with Convergence to Generalized Nash Equilibrium

Z Li, N Azizan - arXiv preprint arXiv:2411.15036, 2024 - arxiv.org
Multi-agent reinforcement learning (MARL) has achieved notable success in cooperative
tasks, demonstrating impressive performance and scalability. However, deploying MARL …

Implicit Safe Set Algorithm for Provably Safe Reinforcement Learning

W Zhao, T He, F Li, C Liu - arXiv preprint arXiv:2405.02754, 2024 - arxiv.org
Deep reinforcement learning (DRL) has demonstrated remarkable performance in many
continuous control tasks. However, a significant obstacle to the real-world application of …

POLICEd RL: Learning Closed-Loop Robot Control Policies with Provable Satisfaction of Hard Constraints

JB Bouvier, K Nagpal, N Mehr - arXiv preprint arXiv:2403.13297, 2024 - arxiv.org
In this paper, we seek to learn a robot policy guaranteed to satisfy state constraints. To
encourage constraint satisfaction, existing RL algorithms typically rely on Constrained …

Learning to Provably Satisfy High Relative Degree Constraints for Black-Box Systems

JB Bouvier, K Nagpal, N Mehr - arXiv preprint arXiv:2407.20456, 2024 - arxiv.org
In this paper, we develop a method for learning a control policy guaranteed to satisfy an
affine state constraint of high relative degree in closed loop with a black-box system …