How to certify machine learning based safety-critical systems? A systematic literature review

F Tambon, G Laberge, L An, A Nikanjam… - Automated Software …, 2022 - Springer
Context: Machine Learning (ML) has been at the heart of many innovations over the
past years. However, including it in so-called “safety-critical” systems such as automotive or …

Recovery RL: Safe reinforcement learning with learned recovery zones

B Thananjeyan, A Balakrishna, S Nair… - IEEE Robotics and …, 2021 - ieeexplore.ieee.org
Safety remains a central obstacle preventing widespread use of RL in the real world:
learning new tasks in uncertain environments requires extensive exploration, but safety …

Maximum entropy RL (provably) solves some robust RL problems

B Eysenbach, S Levine - arXiv preprint arXiv:2103.06257, 2021 - arxiv.org
Many potential applications of reinforcement learning (RL) require guarantees that the agent
will perform well in the face of disturbances to the dynamics or reward function. In this paper …

Can autonomous vehicles identify, recover from, and adapt to distribution shifts?

A Filos, P Tigkas, R McAllister… - International …, 2020 - proceedings.mlr.press
Out-of-training-distribution (OOD) scenarios are a common challenge of learning
agents at deployment, typically leading to arbitrary deductions and poorly-informed …

Learning to be safe: Deep RL with a safety critic

K Srinivasan, B Eysenbach, S Ha, J Tan… - arXiv preprint arXiv …, 2020 - arxiv.org
Safety is an essential component for deploying reinforcement learning (RL) algorithms in
real-world scenarios, and is critical during the learning process itself. A natural first approach …

WCSAC: Worst-case soft actor critic for safety-constrained reinforcement learning

Q Yang, TD Simão, SH Tindemans… - Proceedings of the AAAI …, 2021 - ojs.aaai.org
Safe exploration is regarded as a key priority area for reinforcement learning research. With
separate reward and safety signals, it is natural to cast it as constrained reinforcement …

Conservative offline distributional reinforcement learning

Y Ma, D Jayaraman, O Bastani - Advances in neural …, 2021 - proceedings.neurips.cc
Many reinforcement learning (RL) problems in practice are offline, learning purely from
observational data. A key challenge is how to ensure the learned policy is safe, which …

One solution is not all you need: Few-shot extrapolation via structured MaxEnt RL

S Kumar, A Kumar, S Levine… - Advances in Neural …, 2020 - proceedings.neurips.cc
While reinforcement learning algorithms can learn effective policies for complex tasks, these
policies are often brittle to even minor task variations, especially when variations are not …

Safety-constrained reinforcement learning with a distributional safety critic

Q Yang, TD Simão, SH Tindemans, MTJ Spaan - Machine Learning, 2023 - Springer
Safety is critical to broadening the real-world use of reinforcement learning. Modeling the
safety aspects using a safety-cost signal separate from the reward and bounding the …

Efficient risk-averse reinforcement learning

I Greenberg, Y Chow… - Advances in Neural …, 2022 - proceedings.neurips.cc
In risk-averse reinforcement learning (RL), the goal is to optimize some risk measure of the
returns. A risk measure often focuses on the worst returns out of the agent's experience. As a …