21,077 research outputs found
Certified Reinforcement Learning with Logic Guidance
This paper proposes the first model-free Reinforcement Learning (RL)
framework to synthesise policies for unknown, and continuous-state Markov
Decision Processes (MDPs), such that a given linear temporal property is
satisfied. We convert the given property into a Limit Deterministic Buchi
Automaton (LDBA), namely a finite-state machine expressing the property.
Exploiting the structure of the LDBA, we shape a synchronous reward function
on-the-fly, so that an RL algorithm can synthesise a policy resulting in traces
that probabilistically satisfy the linear temporal property. This probability
(certificate) is also calculated in parallel with policy learning when the
state space of the MDP is finite: as such, the RL algorithm produces a policy
that is certified with respect to the property. Under the assumption of finite
state space, theoretical guarantees are provided on the convergence of the RL
algorithm to an optimal policy, maximising the above probability. We also show
that our method produces ''best available'' control policies when the logical
property cannot be satisfied. In the general case of a continuous state space,
we propose a neural network architecture for RL and we empirically show that
the algorithm finds satisfying policies, if there exist such policies. The
performance of the proposed framework is evaluated via a set of numerical
examples and benchmarks, where we observe an improvement of one order of
magnitude in the number of iterations required for the policy synthesis,
compared to existing approaches whenever available.Comment: This article draws from arXiv:1801.08099, arXiv:1809.0782
A Multi-Core Solver for Parity Games
We describe a parallel algorithm for solving parity games,\ud
with applications in, e.g., modal mu-calculus model\ud
checking with arbitrary alternations, and (branching) bisimulation\ud
checking. The algorithm is based on Jurdzinski's Small Progress\ud
Measures. Actually, this is a class of algorithms, depending on\ud
a selection heuristics.\ud
\ud
Our algorithm operates lock-free, and mostly wait-free (except for\ud
infrequent termination detection), and thus allows maximum\ud
parallelism. Additionally, we conserve memory by avoiding storage\ud
of predecessor edges for the parity graph through strictly\ud
forward-looking heuristics.\ud
\ud
We evaluate our multi-core implementation's behaviour on parity games\ud
obtained from mu-calculus model checking problems for a set of\ud
communication protocols, randomly generated problem instances, and\ud
parametric problem instances from the literature.\ud
\u
Model checking embedded system designs
We survey the basic principles behind the application of model checking to controller verification and synthesis. A promising development is the area of guided model checking, in which the state space search strategy of the model checking algorithm can be influenced to visit more interesting sets of states first. In particular, we discuss how model checking can be combined with heuristic cost functions to guide search strategies. Finally, we list a number of current research developments, especially in the area of reachability analysis for optimal control and related issues
Benchmarks for Parity Games (extended version)
We propose a benchmark suite for parity games that includes all benchmarks
that have been used in the literature, and make it available online. We give an
overview of the parity games, including a description of how they have been
generated. We also describe structural properties of parity games, and using
these properties we show that our benchmarks are representative. With this work
we provide a starting point for further experimentation with parity games.Comment: The corresponding tool and benchmarks are available from
https://github.com/jkeiren/paritygame-generator. This is an extended version
of the paper that has been accepted for FSEN 201
Rapid Recovery for Systems with Scarce Faults
Our goal is to achieve a high degree of fault tolerance through the control
of a safety critical systems. This reduces to solving a game between a
malicious environment that injects failures and a controller who tries to
establish a correct behavior. We suggest a new control objective for such
systems that offers a better balance between complexity and precision: we seek
systems that are k-resilient. In order to be k-resilient, a system needs to be
able to rapidly recover from a small number, up to k, of local faults
infinitely many times, provided that blocks of up to k faults are separated by
short recovery periods in which no fault occurs. k-resilience is a simple but
powerful abstraction from the precise distribution of local faults, but much
more refined than the traditional objective to maximize the number of local
faults. We argue why we believe this to be the right level of abstraction for
safety critical systems when local faults are few and far between. We show that
the computational complexity of constructing optimal control with respect to
resilience is low and demonstrate the feasibility through an implementation and
experimental results.Comment: In Proceedings GandALF 2012, arXiv:1210.202
Courcelle's Theorem - A Game-Theoretic Approach
Courcelle's Theorem states that every problem definable in Monadic
Second-Order logic can be solved in linear time on structures of bounded
treewidth, for example, by constructing a tree automaton that recognizes or
rejects a tree decomposition of the structure. Existing, optimized software
like the MONA tool can be used to build the corresponding tree automata, which
for bounded treewidth are of constant size. Unfortunately, the constants
involved can become extremely large - every quantifier alternation requires a
power set construction for the automaton. Here, the required space can become a
problem in practical applications.
In this paper, we present a novel, direct approach based on model checking
games, which avoids the expensive power set construction. Experiments with an
implementation are promising, and we can solve problems on graphs where the
automata-theoretic approach fails in practice.Comment: submitte
On minimising the maximum expected verification time
Cyber Physical Systems (CPSs) consist of hardware and software components. To verify that the whole (i.e., software + hardware) system meets the given specifications, exhaustive simulation-based approaches (Hardware In the Loop Simulation, HILS) can be effectively used by first generating all relevant simulation scenarios (i.e., sequences of disturbances) and then actually simulating all of them (verification phase). When considering the whole verification activity, we see that the above mentioned verification phase is repeated until no error is found. Accordingly, in order to minimise the time taken by the whole verification activity, in each verification phase we should, ideally, start by simulating scenarios witnessing errors (counterexamples). Of course, to know beforehand the set of such scenarios is not feasible. In this paper we show how to select scenarios so as to minimise the Worst Case Expected Verification Tim
- ā¦