
    Discounting in LTL

    In recent years there has been a growing need and interest in formalizing and reasoning about the quality of software and hardware systems. As opposed to traditional verification, where one handles the question of whether or not a system satisfies a given specification, reasoning about quality addresses the question of how well the system satisfies the specification. One direction in this effort is to refine the "eventually" operators of temporal logic into discounting operators: the satisfaction value of a specification is a value in [0,1], where the longer it takes to fulfill the eventuality requirements, the smaller the satisfaction value. In this paper we introduce an augmentation of Linear Temporal Logic (LTL) by discounting and study it, as well as its combination with propositional quality operators. We show that one can augment LTL with an arbitrary set of discounting functions while preserving the decidability of the model-checking problem. Further augmenting the logic with unary propositional quality operators preserves decidability, whereas adding an average operator makes some problems undecidable. We also discuss the complexity of the problem, as well as various extensions.
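
    A minimal sketch of the discounted "eventually" on a finite trace, under the natural semantics suggested by the abstract (not code from the paper): the satisfaction value of F p is the largest d(i) over positions i where p holds, for a chosen discounting function d, e.g. d(i) = lambda**i.

```python
def discounted_eventually(trace, prop, d):
    """Discounted 'eventually prop' on a finite trace.
    trace: list of sets of atomic propositions, one set per position;
    prop: proposition name; d: discounting function from position to [0,1]."""
    return max((d(i) for i, letter in enumerate(trace) if prop in letter),
               default=0.0)

# Exponential discounting d(i) = lam**i, one typical choice of discounting function.
trace = [{"q"}, {"q"}, {"p", "q"}, {"p"}]
lam = 0.5
print(discounted_eventually(trace, "p", lambda i: lam ** i))  # 0.25: p first holds at position 2
```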

    Certified Reinforcement Learning with Logic Guidance

    This paper proposes the first model-free Reinforcement Learning (RL) framework to synthesise policies for unknown, continuous-state Markov Decision Processes (MDPs), such that a given linear temporal property is satisfied. We convert the given property into a Limit Deterministic Büchi Automaton (LDBA), namely a finite-state machine expressing the property. Exploiting the structure of the LDBA, we shape a synchronous reward function on the fly, so that an RL algorithm can synthesise a policy resulting in traces that probabilistically satisfy the linear temporal property. This probability (a certificate) is also calculated in parallel with policy learning when the state space of the MDP is finite: as such, the RL algorithm produces a policy that is certified with respect to the property. Under the assumption of a finite state space, theoretical guarantees are provided on the convergence of the RL algorithm to an optimal policy that maximises this probability. We also show that our method produces "best available" control policies when the logical property cannot be satisfied. In the general case of a continuous state space, we propose a neural network architecture for RL and empirically show that the algorithm finds satisfying policies, if such policies exist. The performance of the proposed framework is evaluated via a set of numerical examples and benchmarks, where we observe an improvement of one order of magnitude in the number of iterations required for policy synthesis, compared to existing approaches whenever available.
    Comment: This article draws from arXiv:1801.08099, arXiv:1809.0782
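
    A minimal sketch of the reward-shaping idea described in the abstract, not the paper's implementation: the property automaton runs in sync with the MDP, and a reward is issued whenever an accepting automaton state is visited. The two-state automaton below is a hypothetical stand-in for an LDBA of the property "eventually p".

```python
class ToyAutomaton:
    """Deterministic two-state automaton for 'eventually p': q0 -> q1 on p, q1 absorbing and accepting."""
    accepting = {"q1"}

    def step(self, q, label):
        return "q1" if (q == "q1" or "p" in label) else "q0"

def shaped_reward(automaton, q, label, r_accept=1.0):
    """Advance the automaton on the current MDP state's label and return
    (next automaton state, synchronous reward)."""
    q_next = automaton.step(q, label)
    reward = r_accept if q_next in automaton.accepting else 0.0
    return q_next, reward

# Example: feed the labels of a short MDP trace whose third state satisfies p.
aut, q = ToyAutomaton(), "q0"
for label in [set(), {"q"}, {"p"}, {"p", "q"}]:
    q, r = shaped_reward(aut, q, label)
    print(q, r)  # reward 1.0 once the accepting state q1 is reached
```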

    Model checking Quantitative Linear Time Logic

    This paper considers QLtl, a quantitative analogue of Ltl, and presents algorithms for model checking QLtl over quantitative versions of Kripke structures and Markov chains.
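
    A minimal sketch of one plausible quantitative semantics, not necessarily the one used in the paper: states of a quantitative Kripke structure carry values in [0,1] for each proposition, and the value of "eventually p" at a state is the supremum of p's value over all reachable states, computed here by a simple fixpoint iteration.

```python
def eventually_value(states, succ, label_p):
    """states: iterable of states; succ: dict state -> list of successors;
    label_p: dict state -> value of proposition p in [0,1]."""
    val = dict(label_p)                      # start from the local value of p
    changed = True
    while changed:                           # propagate values backwards until a fixpoint
        changed = False
        for s in states:
            best = max([val[s]] + [val[t] for t in succ[s]])
            if best > val[s]:
                val[s] = best
                changed = True
    return val

# Tiny example: s0 -> s1 -> s2 (self-loop), with p-values 0.2, 0.0, 0.9.
succ = {"s0": ["s1"], "s1": ["s2"], "s2": ["s2"]}
label_p = {"s0": 0.2, "s1": 0.0, "s2": 0.9}
print(eventually_value(["s0", "s1", "s2"], succ, label_p))  # every state gets value 0.9
```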

    Model Checking Games for the Quantitative mu-Calculus

    We investigate quantitative extensions of modal logic and the modal mu-calculus, and study the question of whether the tight connection between logic and games can be lifted from the qualitative logics to their quantitative counterparts. It turns out that, if the quantitative mu-calculus is defined in an appropriate way, respecting the duality properties between the logical operators, then its model-checking problem can indeed be characterised by a quantitative variant of parity games. However, these quantitative games have quite different properties from their classical counterparts; in particular, they are in general not positionally determined. The correspondence between the logic and the games goes both ways: the value of a formula on a quantitative transition system coincides with the value of the associated quantitative game, and conversely, the values of quantitative parity games are definable in the quantitative mu-calculus.
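
    A minimal sketch of evaluating a quantitative mu-calculus formula by Kleene iteration, under one common choice of operators (conjunction as min, box as min over successors); the paper's exact definitions may differ. The greatest fixpoint nu X.(p AND box X), i.e. a quantitative "always p", is approximated from above.

```python
def always_value(states, succ, label_p, iterations=100):
    """Greatest-fixpoint iteration for nu X.(p AND box X) with min-semantics."""
    val = {s: 1.0 for s in states}           # start from the top element of [0,1]
    for _ in range(iterations):
        new = {s: min(label_p[s], min(val[t] for t in succ[s])) for s in states}
        if new == val:                       # fixpoint reached
            break
        val = new
    return val

# Tiny example: s0 -> s1, s1 -> s1, s2 -> {s0, s1}.
succ = {"s0": ["s1"], "s1": ["s1"], "s2": ["s0", "s1"]}
label_p = {"s0": 0.8, "s1": 0.9, "s2": 0.5}
print(always_value(["s0", "s1", "s2"], succ, label_p))  # {'s0': 0.8, 's1': 0.9, 's2': 0.5}
```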

    Equilibria-based Probabilistic Model Checking for Concurrent Stochastic Games

    Probabilistic model checking for stochastic games enables formal verification of systems that comprise competing or collaborating entities operating in a stochastic environment. Despite good progress in the area, existing approaches focus on zero-sum goals and cannot reason about scenarios where entities are endowed with different objectives. In this paper, we propose probabilistic model checking techniques for concurrent stochastic games based on Nash equilibria. We extend the temporal logic rPATL (probabilistic alternating-time temporal logic with rewards) to allow reasoning about players with distinct quantitative goals, which capture either the probability of an event occurring or a reward measure. We present algorithms to synthesise strategies that are subgame-perfect, social-welfare-optimal Nash equilibria, i.e., where no player has an incentive to unilaterally change their strategy in any state of the game, whilst the combined probabilities or rewards are maximised. We implement our techniques in the PRISM-games tool and apply them to several case studies, including network protocols and robot navigation, showing the benefits compared to existing approaches.
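
    A minimal sketch of the stage-game ingredient only, not the full rPATL model-checking algorithm: among the pure-strategy profiles of a two-player normal-form game, pick a Nash equilibrium that maximises the sum of the players' payoffs (social welfare). Payoff matrices are indexed as U[a][b] for actions a of player 1 and b of player 2.

```python
import itertools

def sw_optimal_pure_ne(U1, U2):
    """Return (welfare, (a, b)) for a social-welfare-optimal pure Nash equilibrium,
    or None if no pure equilibrium exists."""
    n, m = len(U1), len(U1[0])
    best = None
    for a, b in itertools.product(range(n), range(m)):
        # Nash condition: no player gains by a unilateral deviation.
        if all(U1[a2][b] <= U1[a][b] for a2 in range(n)) and \
           all(U2[a][b2] <= U2[a][b] for b2 in range(m)):
            welfare = U1[a][b] + U2[a][b]
            if best is None or welfare > best[0]:
                best = (welfare, (a, b))
    return best

# Coordination game: both (0,0) and (1,1) are equilibria; (0,0) has higher welfare.
U1 = [[2, 0], [0, 1]]
U2 = [[2, 0], [0, 1]]
print(sw_optimal_pure_ne(U1, U2))  # (4, (0, 0))
```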

    Correct-by-synthesis reinforcement learning with temporal logic constraints

    We consider the problem of synthesizing reactive controllers that optimize some a priori unknown performance criterion while interacting with an uncontrolled environment, such that the system satisfies a given temporal logic specification. We decouple the problem into two subproblems. First, we extract a (maximally) permissive strategy for the system, which encodes multiple (possibly all) ways in which the system can react to the adversarial environment and satisfy the specifications. Then, we quantify the a priori unknown performance criterion as a (still unknown) reward function and compute an optimal strategy for the system within the operating envelope allowed by the permissive strategy, using the so-called maximin-Q learning algorithm. We establish both correctness (with respect to the temporal logic specifications) and optimality (with respect to the a priori unknown performance criterion) of this two-step technique for a fragment of temporal logic specifications. For specifications beyond this fragment, correctness can still be preserved, but the learned strategy may be sub-optimal. We present an algorithm for the overall problem, and demonstrate its use and computational requirements on a set of robot motion planning examples.
    Comment: 8 pages, 3 figures, 2 tables, submitted to IROS 201
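
    A minimal sketch of a maximin-Q-style update, as an illustration rather than the paper's exact algorithm: the system only considers actions permitted by the permissive strategy and values the next state by the best permitted system action against the worst environment response. All states, actions, and Q-entries below are hypothetical.

```python
def maximin_next_value(Q, s_next, permitted_sys, env_actions):
    """Value of s_next: best permitted system action against the worst
    environment response. Missing Q-entries default to 0."""
    return max(
        min(Q.get((s_next, a, e), 0.0) for e in env_actions)
        for a in permitted_sys
    )

def maximin_q_update(Q, s, a_sys, a_env, r, s_next, permitted, env_actions,
                     alpha=0.1, gamma=0.95):
    """One update of Q[(s, a_sys, a_env)] toward the maximin target."""
    target = r + gamma * maximin_next_value(Q, s_next, permitted[s_next], env_actions)
    key = (s, a_sys, a_env)
    Q[key] = Q.get(key, 0.0) + alpha * (target - Q.get(key, 0.0))
    return Q

# Tiny usage example with hypothetical states and actions.
Q = {("s1", "a0", "e0"): 0.5, ("s1", "a0", "e1"): 0.2,
     ("s1", "a1", "e0"): 0.4, ("s1", "a1", "e1"): 0.4}
permitted = {"s1": ["a0", "a1"]}
Q = maximin_q_update(Q, "s0", "a0", "e0", r=1.0, s_next="s1",
                     permitted=permitted, env_actions=["e0", "e1"])
print(Q[("s0", "a0", "e0")])  # 0.1 * (1.0 + 0.95 * 0.4) = 0.138
```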