A review of safe reinforcement learning: Methods, theory and applications
S Gu, L Yang, Y Du, G Chen, F Walter, J Wang… - arXiv preprint arXiv …, 2022 - arxiv.org
Reinforcement Learning (RL) has achieved tremendous success in many complex decision-
making tasks. However, safety concerns are raised during deploying RL in real-world …
Constraint-conditioned policy optimization for versatile safe reinforcement learning
Y Yao, Z Liu, Z Cen, J Zhu, W Yu… - Advances in Neural …, 2024 - proceedings.neurips.cc
Safe reinforcement learning (RL) focuses on training reward-maximizing agents subject to
pre-defined safety constraints. Yet, learning versatile safe policies that can adapt to varying …
Meta inverse constrained reinforcement learning: Convergence guarantee and generalization analysis
S Liu, M Zhu - The Twelfth International Conference on Learning …, 2023 - openreview.net
This paper considers the problem of learning the reward function and constraints of an
expert from few demonstrations. This problem can be considered as a meta-learning …
Online constrained meta-learning: provable guarantees for generalization
S Xu, M Zhu - Advances in Neural Information Processing …, 2024 - proceedings.neurips.cc
Meta-learning has attracted attention due to its strong ability to learn experiences from
known tasks, which can speed up and enhance the learning process for new tasks …
Multi-agent meta-reinforcement learning: sharper convergence rates with task similarity
W Mao, H Qiu, C Wang, H Franke… - Advances in …, 2024 - proceedings.neurips.cc
Multi-agent reinforcement learning (MARL) has primarily focused on solving a single task in
isolation, while in practice the environment is often evolving, leaving many related tasks to …
Gradient shaping for multi-constraint safe reinforcement learning
Y Yao, Z Liu, Z Cen, P Huang… - … Annual Learning for …, 2024 - proceedings.mlr.press
Online safe reinforcement learning (RL) involves training a policy that maximizes task
efficiency while satisfying constraints via interacting with the environments. In this paper, our …
Local analysis of entropy-regularized stochastic soft-max policy gradient methods
Y Ding, J Zhang, J Lavaei - 2023 European Control Conference …, 2023 - ieeexplore.ieee.org
Entropy regularization is an efficient technique for encouraging exploration and preventing a
premature convergence of (vanilla) policy gradient methods in reinforcement learning (RL) …
Preparing for Black Swans: The Antifragility Imperative for Machine Learning
M Jin - arXiv preprint arXiv:2405.11397, 2024 - arxiv.org
Operating safely and reliably despite continual distribution shifts is vital for high-stakes
machine learning applications. This paper builds upon the transformative concept …
Safe machine learning for intelligent multi-robot systems
Z Yuan - 2024 - etda.libraries.psu.edu
Recent advances in embedded computing and mobile sensing have led to pervasive use of
robotic systems in both civil and military applications. With single autonomous robots for …
Robust Position Estimation using Range Measurements from Transmitters with Inaccurate Positions
A Sel, S Hayek, ZM Kassas - people.engineering.osu.edu
The problem of position estimation using range measurements from transmitters with
inaccurately known positions is considered. The true position of each transmitter is assumed …