Lazy probabilistic model checking without determinisation

AK Bozkurt, Y Wang, MM Zavlanos… - 2020 IEEE International …, 2020 - ieeexplore.ieee.org

We present a reinforcement learning (RL) framework to synthesize a control policy from a
given linear temporal logic (LTL) specification in an unknown stochastic environment that …

Cited by 150 · Related articles · All 7 versions

[PDF] springer.com

Omega-regular objectives in model-free reinforcement learning

EM Hahn, M Perez, S Schewe, F Somenzi… - … conference on tools and …, 2019 - Springer

We provide the first solution for model-free reinforcement learning of ω-regular objectives for
Markov decision processes (MDPs). We present a constructive reduction from the almost …

Cited by 170 · Related articles · All 15 versions

[PDF] tum.de

Limit-deterministic Büchi automata for linear temporal logic

S Sickert, J Esparza, S Jaax, J Křetínský - International Conference on …, 2016 - Springer

Limit-deterministic Büchi automata can replace deterministic Rabin automata in probabilistic
model checking algorithms, and can be significantly smaller. We present a direct …

Cited by 144 · Related articles · All 5 versions

[PDF] utwente.nl

JANI: quantitative model and tool interaction

CE Budde, C Dehnert, EM Hahn, A Hartmanns… - … 2017, Held as Part of the …, 2017 - Springer

The formal analysis of critical systems is supported by a vast space of modelling formalisms
and tools. The variety of incompatible formats and tools however poses a significant …

Cited by 138 · Related articles · All 12 versions

[PDF] springer.com

Policy synthesis and reinforcement learning for discounted LTL

R Alur, O Bastani, K Jothimurugan, M Perez… - … on Computer Aided …, 2023 - Springer

The difficulty of manually specifying reward functions has led to an interest in using linear
temporal logic (LTL) to express objectives for reinforcement learning (RL). However, LTL …

Cited by 11 · Related articles · All 6 versions

[PDF] arxiv.org

Optimal probabilistic motion planning with potential infeasible LTL constraints

M Cai, S Xiao, Z Li, Z Kan - IEEE Transactions on Automatic …, 2021 - ieeexplore.ieee.org

This paper studies optimal motion planning subject to motion and environment uncertainties.
By modeling the system as a probabilistic labeled Markov decision process (PL-MDP), the …

Cited by 47 · Related articles · All 10 versions

[PDF] acm.org

Multi-objective ω-regular reinforcement learning

EM Hahn, M Perez, S Schewe, F Somenzi… - Formal Aspects of …, 2023 - dl.acm.org

The expanding role of reinforcement learning (RL) in safety-critical system design has
promoted ω-automata as a way to express learning requirements—often non-Markovian …

Cited by 5 · Related articles · All 4 versions

[PDF] neurips.cc

Policy optimization with linear temporal logic constraints

C Voloshin, H Le, S Chaudhuri… - Advances in Neural …, 2022 - proceedings.neurips.cc

We study the problem of policy optimization (PO) with linear temporal logic (LTL) constraints.
The language of LTL allows flexible description of tasks that may be unnatural to encode as …

Cited by 20 · Related articles · All 12 versions

[PDF] springer.com

Mungojerrie: Linear-time objectives in model-free reinforcement learning

EM Hahn, M Perez, S Schewe, F Somenzi… - … Conference on Tools …, 2023 - Springer

Mungojerrie is an extensible tool that provides a framework to translate linear-time
objectives into reward for reinforcement learning (RL). The tool provides convergent RL …

Cited by 6 · Related articles · All 5 versions

[PDF] nsf.gov

Translating omega-regular specifications to average objectives for model-free reinforcement learning

M Kazemi, M Perez, F Somenzi, S Soudjani… - Proc. of the 21st …, 2022 - par.nsf.gov

Recent success in reinforcement learning (RL) has brought renewed attention to the design
of reward functions by which agent behavior is reinforced or deterred. Manually designing …

Cited by 16 · Related articles · All 6 versions