LTL Control in Uncertain Environments with Probabilistic Satisfaction Guarantees
We present a method to generate a robot control strategy that maximizes the probability of accomplishing a task. The task is given as a Linear Temporal Logic
(LTL) formula over a set of properties that can be satisfied at the regions of
a partitioned environment. We assume that the probabilities with which the
properties are satisfied at the regions are known, and the robot can determine
the truth value of a proposition only at the current region. Motivated by
several results on partition-based abstractions, we assume that the motion is
performed on a graph. To account for noisy sensors and actuators, we assume
that a control action enables several transitions with known probabilities. We
show that this problem can be reduced to the problem of generating a control
policy for a Markov Decision Process (MDP) such that the probability of
satisfying an LTL formula over its states is maximized. We provide a complete
solution for the latter problem that builds on existing results from
probabilistic model checking. We include an illustrative case study.
Comment: Technical Report accompanying IFAC 201
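A minimal sketch of the standard reduction this abstract describes: compose the MDP with a deterministic Rabin automaton (DRA) for the LTL formula, then maximize the probability of reaching the target states by value iteration. All names here (Mdp, Dra, the step interface) are hypothetical stand-ins; the full solution also requires end-component analysis on the product, which is elided.

```python
from dataclasses import dataclass

@dataclass
class Mdp:
    states: list
    actions: dict   # state -> list of actions
    trans: dict     # (state, action) -> {successor: probability}
    label: dict     # state -> set of atomic propositions

@dataclass
class Dra:
    states: list
    init: object
    step: callable  # (dra_state, set of propositions) -> dra_state

def product(mdp, dra):
    """Product MDP whose runs carry the automaton state alongside the MDP state."""
    trans = {}
    for s in mdp.states:
        for a in mdp.actions[s]:
            for q in dra.states:
                dist = {}
                for t, p in mdp.trans[(s, a)].items():
                    dist[(t, dra.step(q, mdp.label[t]))] = p
                trans[((s, q), a)] = dist
    return trans

def max_reach_prob(prod_trans, targets, states, actions, eps=1e-8):
    """Value iteration for the maximal probability of reaching `targets`
    (in the full method, the states of accepting end components)."""
    v = {s: (1.0 if s in targets else 0.0) for s in states}
    while True:
        delta = 0.0
        for s in states:
            if s in targets:
                continue
            best = max(sum(p * v[t] for t, p in prod_trans[(s, a)].items())
                       for a in actions[s])
            delta = max(delta, abs(best - v[s]))
            v[s] = best
        if delta < eps:
            return v
```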
Learning Markov Decision Processes for Model Checking
Constructing an accurate system model for formal model verification can be
both resource-demanding and time-consuming. To alleviate this shortcoming,
algorithms have been proposed for automatically learning system models based on
observed system behaviors. In this paper we extend an algorithm for learning probabilistic automata to reactive systems, where the observed system behavior
is in the form of alternating sequences of inputs and outputs. We propose an
algorithm for automatically learning a deterministic labeled Markov decision
process model from the observed behavior of a reactive system. The proposed
learning algorithm is adapted from algorithms for learning deterministic
probabilistic finite automata, and extended to include both probabilistic and
nondeterministic transitions. The algorithm is empirically analyzed and
evaluated by learning system models of slot machines. The evaluation is
performed by analyzing the probabilistic linear temporal logic properties of
the system as well as by analyzing the schedulers, in particular the optimal
schedulers, induced by the learned models.
Comment: In Proceedings QFM 2012, arXiv:1212.345
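A hedged sketch of the two core ingredients such state-merging learners build on: a frequency prefix tree over alternating input/output traces, plus the Hoeffding test used to decide whether two tree nodes may be merged into one MDP state. The merge loop itself is elided, and all identifiers are illustrative rather than the paper's exact API.

```python
import math
from collections import defaultdict

def build_prefix_tree(traces):
    """traces: list of alternating [in0, out0, in1, out1, ...] sequences.
    Returns counts[node][(inp, out)] -> frequency, with nodes named by
    the trace prefix that reaches them."""
    counts = defaultdict(lambda: defaultdict(int))
    for trace in traces:
        node = ()
        for inp, out in zip(trace[0::2], trace[1::2]):
            counts[node][(inp, out)] += 1
            node = node + (inp, out)
    return counts

def hoeffding_compatible(c1, c2, alpha=0.05):
    """Hoeffding-bound test: are two empirical successor distributions
    statistically indistinguishable (and hence mergeable)?"""
    n1, n2 = sum(c1.values()), sum(c2.values())
    if n1 == 0 or n2 == 0:
        return True
    bound = ((math.sqrt(1 / n1) + math.sqrt(1 / n2))
             * math.sqrt(0.5 * math.log(2 / alpha)))
    for sym in set(c1) | set(c2):
        if abs(c1.get(sym, 0) / n1 - c2.get(sym, 0) / n2) > bound:
            return False
    return True
```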
Model checking Quantitative Linear Time Logic
This paper considers QLtl, a quantitative analogue of Ltl, and presents algorithms for model checking QLtl over quantitative versions of Kripke structures and Markov chains.
On Frequency LTL in Probabilistic Systems
We study frequency linear-time temporal logic (fLTL), which extends linear-time temporal logic (LTL) with a path operator expressing that, on a path, a certain formula holds with at least a given frequency p, thus relaxing the semantics of the usual G operator of LTL. Such a logic is particularly useful
in probabilistic systems, where some undesirable events such as random failures
may occur and are acceptable if they are rare enough.
Frequency-related extensions of LTL have been previously studied by several
authors; in most cases the logic is equipped with extended "until" and "globally" operators, leading to undecidability of most interesting problems.
For the variant we study, we are able to establish fundamental decidability
results. We show that for Markov chains, the problem of computing the
probability with which a given fLTL formula holds has the same complexity as
the analogous problem for LTL. We also show that for Markov decision processes
the problem becomes more delicate, but when the frequency bound is restricted to 1 and negations are not allowed outside the scope of any operator, we can compute the maximum probability of satisfying the fLTL formula. This can again be performed with the same time complexity as for ordinary LTL formulas.
Comment: A paper presented at CONCUR 2015, with appendix
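A toy illustration of the relaxed "globally" operator discussed above: on an ultimately periodic word u.v^omega the long-run frequency of positions satisfying a proposition is determined by the cycle v alone, so a formula of the shape G^{>=p} a can be checked by counting within the cycle. This is only an illustrative reading of the frequency semantics on single words; the paper's algorithms work on Markov chains and MDPs.

```python
def freq_globally_holds(prefix, cycle, prop, p):
    """Does `prop` hold with long-run frequency at least p on prefix.cycle^omega?
    prefix, cycle: lists of sets of atomic propositions. The finite prefix
    does not affect the long-run average, so only the cycle is counted."""
    hits = sum(1 for letter in cycle if prop in letter)
    return hits / len(cycle) >= p

# Example: on ({a} {})^omega, proposition a has frequency 1/2,
# so G^{>=0.5} a holds but G^{>=0.6} a does not.
assert freq_globally_holds([], [{"a"}, set()], "a", 0.5)
assert not freq_globally_holds([], [{"a"}, set()], "a", 0.6)
```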
One Theorem to Rule Them All: A Unified Translation of LTL into ω-Automata
We present a unified translation of LTL formulas into deterministic Rabin automata, limit-deterministic Büchi automata, and nondeterministic Büchi
automata. The translations yield automata of asymptotically optimal size
(double or single exponential, respectively). All three translations are
derived from a single Master Theorem of a purely logical nature. The Master
Theorem decomposes the language of a formula into a positive boolean
combination of languages that can be translated into ω-automata by
elementary means. In particular, Safra's, ranking, and breakpoint constructions
used in other translations are not needed.
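A sketch of the "after" function af(φ, ν) that this line of work builds on: af rewrites a formula in negation normal form into what must hold after reading one letter ν, and the Master Theorem's decomposition is stated in terms of it. The encoding of formulas as tuples below is our own illustrative choice, not the paper's notation.

```python
TT, FF = ("tt",), ("ff",)

def af(phi, nu):
    """One-letter derivative of an NNF LTL formula phi under letter nu
    (nu is the set of atomic propositions holding at this position)."""
    op = phi[0]
    if op in ("tt", "ff"):
        return phi
    if op == "ap":                       # atomic proposition
        return TT if phi[1] in nu else FF
    if op == "not":                      # NNF: negation only on atoms
        return FF if phi[1] in nu else TT
    if op == "and":
        return ("and", af(phi[1], nu), af(phi[2], nu))
    if op == "or":
        return ("or", af(phi[1], nu), af(phi[2], nu))
    if op == "X":
        return phi[1]
    if op == "G":
        return ("and", af(phi[1], nu), phi)
    if op == "F":
        return ("or", af(phi[1], nu), phi)
    if op == "U":                        # phi1 U phi2
        return ("or", af(phi[2], nu), ("and", af(phi[1], nu), phi))
    if op == "R":                        # phi1 R phi2
        return ("and", af(phi[2], nu), ("or", af(phi[1], nu), phi))
    raise ValueError(op)

# Example: af(a U b, {a}) = ff or (tt and (a U b)), i.e. a U b again
# once Boolean simplification (omitted here) is applied.
```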
Certified Reinforcement Learning with Logic Guidance
This paper proposes the first model-free Reinforcement Learning (RL)
framework to synthesise policies for unknown, continuous-state Markov Decision Processes (MDPs), such that a given linear temporal property is satisfied. We convert the given property into a Limit Deterministic Büchi Automaton (LDBA), that is, a finite-state machine expressing the property.
Exploiting the structure of the LDBA, we shape a synchronous reward function
on-the-fly, so that an RL algorithm can synthesise a policy resulting in traces
that probabilistically satisfy the linear temporal property. This probability
(certificate) is also calculated in parallel with policy learning when the
state space of the MDP is finite: as such, the RL algorithm produces a policy
that is certified with respect to the property. Under the assumption of finite
state space, theoretical guarantees are provided on the convergence of the RL
algorithm to an optimal policy, maximising the above probability. We also show
that our method produces "best available" control policies when the logical
property cannot be satisfied. In the general case of a continuous state space,
we propose a neural network architecture for RL and we empirically show that
the algorithm finds satisfying policies, if there exist such policies. The
performance of the proposed framework is evaluated via a set of numerical
examples and benchmarks, where we observe an improvement of one order of
magnitude in the number of iterations required for the policy synthesis,
compared to existing approaches whenever available.
Comment: This article draws from arXiv:1801.08099, arXiv:1809.0782
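A hedged sketch of the reward-shaping idea described above: run the MDP in lockstep with an LDBA for the property, grant reward whenever an accepting automaton state is visited, and apply ordinary Q-learning on the product. The `env` and `ldba` interfaces are hypothetical stand-ins, not the paper's exact construction.

```python
import random
from collections import defaultdict

def q_learning_with_ldba(env, ldba, episodes=1000, alpha=0.1,
                         gamma=0.99, epsilon=0.1, r_accept=1.0):
    """Q-learning on the on-the-fly product of an MDP `env` and an LDBA.
    Q is indexed by ((mdp_state, ldba_state), action)."""
    Q = defaultdict(float)
    for _ in range(episodes):
        s, q = env.reset(), ldba.init
        done = False
        while not done:
            acts = env.actions(s)
            a = (random.choice(acts) if random.random() < epsilon
                 else max(acts, key=lambda a: Q[((s, q), a)]))
            s2, done = env.step(s, a)                 # hypothetical interface
            q2 = ldba.step(q, env.label(s2))
            r = r_accept if q2 in ldba.accepting else 0.0   # shaped reward
            best_next = max((Q[((s2, q2), b)] for b in env.actions(s2)),
                            default=0.0)
            Q[((s, q), a)] += alpha * (r + gamma * best_next - Q[((s, q), a)])
            s, q = s2, q2
    return Q
```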