Decision-making under uncertainty: beyond probabilities: Challenges and perspectives

T Badings, TD Simão, M Suilen, N Jansen - International Journal on …, 2023 - Springer
This position paper reflects on the state-of-the-art in decision-making under uncertainty. A
classical assumption is that probabilities can sufficiently capture all uncertainty in a system …

Robust control for dynamical systems with non-Gaussian noise via formal abstractions

T Badings, L Romao, A Abate, D Parker… - Journal of Artificial …, 2023 - jair.org
Controllers for dynamical systems that operate in safety-critical settings must account for
stochastic disturbances. Such disturbances are often modeled as process noise in a …

Robust anytime learning of Markov decision processes

M Suilen, TD Simão, D Parker… - Advances in Neural …, 2022 - proceedings.neurips.cc
Markov decision processes (MDPs) are formal models commonly used in sequential
decision-making. MDPs capture the stochasticity that may arise, for instance, from imprecise …
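The snippet above concerns learning MDPs whose transition probabilities are only imprecisely known from data. As a generic illustration of that setting (not the paper's own method), one can attach a Hoeffding confidence interval to a sampled transition frequency, yielding the kind of probability interval used in robust/interval MDPs; the sample counts below are invented:

```python
# Illustrative sketch only: point estimate of a transition probability
# from finite samples, plus a Hoeffding confidence radius. The actual
# learning scheme in the cited paper may differ.
import math

def interval_estimate(successes, trials, delta=0.05):
    """Return (lower, point, upper) for a Bernoulli transition
    probability; the interval holds with probability >= 1 - delta."""
    p_hat = successes / trials
    radius = math.sqrt(math.log(2.0 / delta) / (2.0 * trials))
    lo = max(0.0, p_hat - radius)
    hi = min(1.0, p_hat + radius)
    return lo, p_hat, hi

# E.g., 70 observed transitions to a state out of 100 tries:
lo, p, hi = interval_estimate(successes=70, trials=100)
```

More samples shrink the radius, so the interval MDP tightens "anytime" as data arrives.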

Multi-agent reinforcement learning with temporal logic specifications

L Hammond, A Abate, J Gutierrez… - arXiv preprint arXiv …, 2021 - arxiv.org
In this paper, we study the problem of learning to satisfy temporal logic specifications with a
group of agents in an unknown environment, which may exhibit probabilistic behaviour …

Optimistic value iteration

A Hartmanns, BL Kaminski - International Conference on Computer Aided …, 2020 - Springer
Markov decision processes are widely used for planning and verification in settings that
combine controllable or adversarial choices with probabilistic behaviour. The standard …
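The "standard" algorithm the snippet alludes to is plain value iteration, which the paper's optimistic variant augments with sound upper bounds. A minimal sketch of standard value iteration on a toy MDP (the states, actions, probabilities, and rewards below are invented for illustration):

```python
# Toy MDP: mdp[state][action] = list of (probability, next_state, reward).
# All numbers here are made up for illustration.
mdp = {
    0: {"stay": [(1.0, 0, 0.0)], "go": [(0.8, 1, 1.0), (0.2, 0, 0.0)]},
    1: {"stay": [(1.0, 1, 2.0)]},
}

def value_iteration(mdp, gamma=0.9, eps=1e-6):
    """Iterate the Bellman optimality update until the largest
    per-state change drops below eps."""
    V = {s: 0.0 for s in mdp}
    while True:
        delta = 0.0
        for s, actions in mdp.items():
            best = max(
                sum(p * (r + gamma * V[s2]) for p, s2, r in outcomes)
                for outcomes in actions.values()
            )
            delta = max(delta, abs(best - V[s]))
            V[s] = best
        if delta < eps:
            return V

V = value_iteration(mdp)
```

The stopping criterion here is the usual heuristic one; the cited paper's point is precisely that such criteria do not by themselves certify the error of the returned values.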

A framework for transforming specifications in reinforcement learning

R Alur, S Bansal, O Bastani, K Jothimurugan - … on the Occasion of His 60th …, 2022 - Springer
Reactive synthesis algorithms allow the automatic construction of policies that control an
environment modeled as a Markov Decision Process (MDP) and are optimal with respect to …

Policy synthesis and reinforcement learning for discounted LTL

R Alur, O Bastani, K Jothimurugan, M Perez… - … on Computer Aided …, 2023 - Springer
The difficulty of manually specifying reward functions has led to an interest in using linear
temporal logic (LTL) to express objectives for reinforcement learning (RL). However, LTL …

Constructing MDP abstractions using data with formal guarantees

A Lavaei, S Soudjani, E Frazzoli… - IEEE Control Systems …, 2022 - ieeexplore.ieee.org
This letter is concerned with a data-driven technique for constructing finite Markov decision
processes (MDPs) as finite abstractions of discrete-time stochastic control systems with …

On correctness, precision, and performance in quantitative verification: QComp 2020 competition report

CE Budde, A Hartmanns, M Klauck, J Křetínský… - … applications of formal …, 2020 - Springer
Quantitative verification tools compute probabilities, expected rewards, or steady-state
values for formal models of stochastic and timed systems. Exact results often cannot be …

A PAC learning algorithm for LTL and omega-regular objectives in MDPs

M Perez, F Somenzi, A Trivedi - … of the AAAI Conference on Artificial …, 2024 - ojs.aaai.org
Linear temporal logic (LTL) and omega-regular objectives (a superset of LTL) have seen
recent use as a way to express non-Markovian objectives in reinforcement learning. We …