More for less: Safe policy improvement with stronger performance guarantees

A Castellini, F Bianchi, E Zorzi… - International …, 2023 - proceedings.mlr.press

Algorithms for safely improving policies are important to deploy reinforcement learning
approaches in real-world scenarios. In this work, we propose an algorithm, called MCTS …

Towards a formal account on negative latency

C Dubslaff, J Schulz, P Wienhöft, C Baier… - … Conference on Bridging …, 2023 - Springer

Low latency communication is a major challenge when humans have to be integrated into
cyber physical systems with mixed realities. Recently, the concept of negative latency has …

What Are the Odds? Improving the foundations of Statistical Model Checking

T Meggendorfer, M Weininger, P Wienhöft - arXiv preprint arXiv …, 2024 - arxiv.org

Markov decision processes (MDPs) are a fundamental model for decision making under
uncertainty. They exhibit non-deterministic choice as well as probabilistic uncertainty …

Towards alignment of Reinforcement Learning agents; for consideration of safety, robustness and fairness.

H Satija - 2024 - escholarship.mcgill.ca

Reinforcement Learning (RL) has emerged as the standard paradigm for sequential
decision-making and a framework for general intelligence. At its core, the RL problem is one …

[PDF] Safe Policy Improvement in POMDPs

MR Suilen, TD Simão, N Jansen - 2023 - repository.ubn.ru.nl

Reinforcement learning (RL) is the standard approach to solve sequential decision-making
problems when environment dynamics are unknown [9]. By interacting with the environment …