How to certify machine learning based safety-critical systems? A systematic literature review
Context: Machine Learning (ML) has been at the heart of many innovations over the
past years. However, including it in so-called “safety-critical” systems such as automotive or …
Recovery RL: Safe reinforcement learning with learned recovery zones
Safety remains a central obstacle preventing widespread use of RL in the real world:
learning new tasks in uncertain environments requires extensive exploration, but safety …
Maximum entropy RL (provably) solves some robust RL problems
B Eysenbach, S Levine - arXiv preprint arXiv:2103.06257, 2021 - arxiv.org
Many potential applications of reinforcement learning (RL) require guarantees that the agent
will perform well in the face of disturbances to the dynamics or reward function. In this paper …
Can autonomous vehicles identify, recover from, and adapt to distribution shifts?
Out-of-training-distribution (OOD) scenarios are a common challenge of learning
agents at deployment, typically leading to arbitrary deductions and poorly-informed …
Learning to be safe: Deep RL with a safety critic
Safety is an essential component for deploying reinforcement learning (RL) algorithms in
real-world scenarios, and is critical during the learning process itself. A natural first approach …
WCSAC: Worst-case soft actor critic for safety-constrained reinforcement learning
Safe exploration is regarded as a key priority area for reinforcement learning research. With
separate reward and safety signals, it is natural to cast it as constrained reinforcement …
Conservative offline distributional reinforcement learning
Many reinforcement learning (RL) problems in practice are offline, learning purely from
observational data. A key challenge is how to ensure the learned policy is safe, which …
One solution is not all you need: Few-shot extrapolation via structured MaxEnt RL
While reinforcement learning algorithms can learn effective policies for complex tasks, these
policies are often brittle to even minor task variations, especially when variations are not …
Safety-constrained reinforcement learning with a distributional safety critic
Safety is critical to broadening the real-world use of reinforcement learning. Modeling the
safety aspects using a safety-cost signal separate from the reward and bounding the …
Efficient risk-averse reinforcement learning
I Greenberg, Y Chow… - Advances in Neural …, 2022 - proceedings.neurips.cc
In risk-averse reinforcement learning (RL), the goal is to optimize some risk measure of the
returns. A risk measure often focuses on the worst returns out of the agent's experience. As a …