Decision-making under uncertainty: beyond probabilities: Challenges and perspectives

T Badings, TD Simão, M Suilen, N Jansen - International Journal on …, 2023 - Springer
This position paper reflects on the state-of-the-art in decision-making under uncertainty. A
classical assumption is that probabilities can sufficiently capture all uncertainty in a system …

Confidence-aware reinforcement learning for self-driving cars

Z Cao, S Xu, H Peng, D Yang… - IEEE Transactions on …, 2021 - ieeexplore.ieee.org
Reinforcement learning (RL) can be used to design smart driving policies in complex
situations where traditional methods cannot. However, they are frequently black-box in …

Neural simplex architecture

DT Phan, R Grosu, N Jansen, N Paoletti… - NASA Formal Methods …, 2020 - Springer
We present the Neural Simplex Architecture (NSA), a new approach to runtime
assurance that provides safety guarantees for neural controllers (obtained e.g. using …

Safe policy improvement for POMDPs via finite-state controllers

TD Simão, M Suilen, N Jansen - … of the AAAI Conference on Artificial …, 2023 - ojs.aaai.org
We study safe policy improvement (SPI) for partially observable Markov decision processes
(POMDPs). SPI is an offline reinforcement learning (RL) problem that assumes access to (1) …

AlwaysSafe: Reinforcement learning without safety constraint violations during training

TD Simão, N Jansen, MTJ Spaan - 2021 - repository.ubn.ru.nl
Deploying reinforcement learning (RL) involves major concerns around safety. Engineering
a reward signal that allows the agent to maximize its performance while remaining safe is …

Scalable safe policy improvement via Monte Carlo tree search

A Castellini, F Bianchi, E Zorzi… - International …, 2023 - proceedings.mlr.press
Algorithms for safely improving policies are important to deploy reinforcement learning
approaches in real-world scenarios. In this work, we propose an algorithm, called MCTS …

Partially Observable Monte Carlo Planning with state variable constraints for mobile robot navigation

A Castellini, E Marchesini, A Farinelli - Engineering Applications of Artificial …, 2021 - Elsevier
Autonomous mobile robots employed in industrial applications often operate in complex and
uncertain environments. In this paper we propose an approach based on an extension of …

Efficient and scalable reinforcement learning for large-scale network control

C Ma, A Li, Y Du, H Dong, Y Yang - Nature Machine Intelligence, 2024 - nature.com
The primary challenge in the development of large-scale artificial intelligence (AI) systems
lies in achieving scalable decision-making—extending the AI models while maintaining …

Safe policy improvement with soft baseline bootstrapping

K Nadjahi, R Laroche… - Machine Learning and …, 2020 - Springer
Batch Reinforcement Learning (Batch RL) consists in training a policy using
trajectories collected with another policy, called the behavioural policy. Safe policy …

Identify, estimate and bound the uncertainty of reinforcement learning for autonomous driving

W Zhou, Z Cao, N Deng, K Jiang… - IEEE Transactions on …, 2023 - ieeexplore.ieee.org
Deep reinforcement learning (DRL) has emerged as a promising approach for developing
more intelligent autonomous vehicles (AVs). A typical DRL application on AVs is to train a …