Probabilistic Bisimulations for PCTL Model Checking of Interval MDPs
Verification of PCTL properties of MDPs with convex uncertainties has been
investigated recently by Puggelli et al. However, model checking algorithms
typically suffer from state space explosion. In this paper, we address
probabilistic bisimulation to reduce the size of such MDPs while preserving
the PCTL properties they satisfy. We discuss different interpretations of
uncertainty in the models which are studied in the literature and that result
in two different definitions of bisimulations. We give algorithms to compute
the quotients of these bisimulations in time polynomial in the size of the
model and exponential in the uncertain branching. Finally, we show by a case
study that large models in practice can have small branching and that a
substantial state space reduction can be achieved by our approach.
Comment: In Proceedings SynCoP 2014, arXiv:1403.784
Multi-objective Robust Strategy Synthesis for Interval Markov Decision Processes
Interval Markov decision processes (IMDPs) generalise classical MDPs by
having interval-valued transition probabilities. They provide a powerful
modelling tool for probabilistic systems with an additional variation or
uncertainty that prevents the knowledge of the exact transition probabilities.
In this paper, we consider the problem of multi-objective robust strategy
synthesis for interval MDPs, where the aim is to find a robust strategy that
guarantees the satisfaction of multiple properties at the same time in the face of
the transition probability uncertainty. We first show that this problem is
PSPACE-hard. Then, we provide a value iteration-based decision algorithm to
approximate the Pareto set of achievable points. We finally demonstrate the
practical effectiveness of our proposed approaches by applying them to several
case studies using a prototypical tool.
Comment: This article is a full version of a paper accepted to the Conference
on Quantitative Evaluation of SysTems (QEST) 201
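To make the notion of interval-valued transition probabilities concrete, the following is a minimal sketch of one robust evaluation step: given lower and upper bounds on the successor probabilities of a single action and a value for each successor, it picks the distribution inside the intervals that minimises the expected value. This is the well-known greedy ordering argument for a single objective, not the multi-objective Pareto algorithm of the paper; the function name `worst_case_expectation` and the numbers in the example are assumptions made for illustration.

```python
def worst_case_expectation(lower, upper, values):
    """Minimise sum(p[i] * values[i]) over all distributions p with
    lower[i] <= p[i] <= upper[i] and sum(p) == 1.

    Hypothetical helper for illustration; returns (expectation, p).
    """
    if sum(lower) > 1.0 + 1e-12 or sum(upper) < 1.0 - 1e-12:
        raise ValueError("intervals admit no probability distribution")

    # Start from the lower bounds, then hand out the remaining mass
    # to successors in order of increasing value (cheapest first).
    p = list(lower)
    remaining = 1.0 - sum(lower)
    for i in sorted(range(len(values)), key=lambda i: values[i]):
        extra = min(upper[i] - lower[i], remaining)
        p[i] += extra
        remaining -= extra

    return sum(pi * vi for pi, vi in zip(p, values)), p


# Example with three successors (all numbers are made up).
lower = [0.2, 0.3, 0.1]
upper = [0.6, 0.7, 0.5]
values = [1.0, 0.0, 0.5]
expectation, dist = worst_case_expectation(lower, upper, values)
print(expectation, dist)
```

A robust value iteration would apply such a step to every state-action pair in each sweep; maximising instead of minimising only requires reversing the sort order.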
Decision algorithms for modelling, optimal control and verification of probabilistic systems
Markov Decision Processes (MDPs) constitute a mathematical framework for modelling systems featuring both probabilistic and nondeterministic behaviour. They are widely used to solve sequential decision making problems and applied successfully in operations research, artificial intelligence, and stochastic control theory, and have been extended conservatively to the model of probabilistic automata in the context of concurrent probabilistic systems. However, when modeling a physical system they suffer from several limitations. One of the most important is the inherent loss of precision that is introduced by measurement errors and discretization artifacts which necessarily happen due to incomplete knowledge about the system behavior. As a result, the true probability distribution for transitions is in most cases an uncertain value, determined by either external parameters or confidence intervals. Interval Markov decision processes (IMDPs) generalize classical MDPs by having interval-valued transition probabilities. They provide a powerful modelling tool for probabilistic systems with an additional variation or uncertainty that reflects the absence of precise knowledge concerning transition probabilities. In this dissertation, we focus on decision algorithms for modelling and performance evaluation of such probabilistic systems leveraging techniques from mathematical optimization. From a modelling viewpoint, we address probabilistic bisimulations to reduce the size of the system models while preserving the logical properties they satisfy. We also discuss the key ingredients to construct systems by composing them out of smaller components running in parallel. Furthermore, we introduce a novel stochastic model, Uncertain weighted Markov Decision Processes (UwMDPs), so as to capture quantities like preferences or priorities in a nondeterministic scenario with uncertainties. 
This model is close to the model of IMDPs but more convenient to work with in the context of bisimulation minimization. From a performance evaluation perspective, we consider the problem of multi-objective robust strategy synthesis for IMDPs, where the aim is to find a robust strategy that guarantees the satisfaction of multiple properties at the same time in the face of the transition probability uncertainty. In this respect, we discuss the computational complexity of the problem and present a value iteration-based decision algorithm to approximate the Pareto set of achievable optimal points. Moreover, we consider the problem of computing maximal/minimal reward-bounded reachability probabilities on UwMDPs, for which we present an efficient algorithm running in pseudo-polynomial time. We demonstrate the practical effectiveness of our proposed approaches by applying them to a collection of real-world case studies using several prototypical tools.
Markov decision processes (MDPs) provide the framework for modelling systems that exhibit both stochastic and nondeterministic behaviour. This model class has a broad range of applications in solving sequential decision problems and is used successfully in operations research, artificial intelligence, and stochastic control theory. In the area of concurrent probabilistic systems, it has been extended conservatively to probabilistic automata. When MDPs are used to model physical systems, however, they turn out to suffer from a number of limitations. One of the most serious is that the actual behaviour of the system under consideration is usually not fully known. Measurement errors and discretization artifacts make a loss of precision unavoidable. 
The actual transition probability distributions of the system are therefore in most cases not known exactly, but depend on external factors or can only be captured by confidence intervals. Interval Markov decision processes (IMDPs) generalize classical MDPs by allowing the possible transition probability distributions to be expressed by intervals. IMDPs are therefore a powerful modelling tool for probabilistic systems with uncertain behaviour that arises because the exact behaviour of the real system is not known. In this dissertation, we concentrate on decision procedures for modelling and evaluating the properties of such probabilistic systems by applying methods from mathematical optimization. In the area of modelling, we consider probabilistic bisimulation to reduce the size of the system model while preserving its logical properties. We also consider the key techniques for generating models compositionally from smaller components that run in parallel. Furthermore, we introduce a new kind of stochastic model, Uncertain weighted Markov Decision Processes (UwMDPs), in order to express properties such as implementation decisions and user priorities in a nondeterministic scenario. This model resembles IMDPs but is better suited to bisimulation minimization. In the area of evaluating model properties, we consider the problem of generating strategies that can resolve the nondeterminism such that several desired properties are satisfied simultaneously, where every possible choice of probability distributions from the transition intervals must be respected. 
We consider the complexity class of this problem and discuss a value iteration-based algorithm to approximate the Pareto set of achievable optimal points. Furthermore, we consider the problem of computing minimal and maximal reachability probabilities when an upper bound on the accumulated path costs must be respected. For this problem we discuss an efficient algorithm with pseudo-polynomial running time. We demonstrate the efficiency of our approaches in practice by implementing them as prototypes and applying them to a range of realistic case studies.
Perturbation analysis in verification of discrete-time Markov chains
Perturbation analysis in probabilistic verification addresses the robustness and sensitivity problem for verification of stochastic models against qualitative and quantitative properties. We identify two types of perturbation bounds, namely non-asymptotic bounds and asymptotic bounds. Non-asymptotic bounds are exact, pointwise bounds that quantify the upper and lower bounds of the verification result subject to a given perturbation of the model, whereas asymptotic bounds are closed-form bounds that approximate non-asymptotic bounds by assuming that the given perturbation is sufficiently small. We perform perturbation analysis in the setting of Discrete-time Markov Chains. We consider three basic matrix norms to capture the perturbation distance, and focus on the computational aspect. Our main contributions include algorithms and tight complexity bounds for calculating both non-asymptotic bounds and asymptotic bounds with respect to the three perturbation distances. © 2014 Springer-Verlag
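As a rough illustration of what a perturbation does to a verification result, the sketch below computes the probability of eventually reaching a target state in a small discrete-time Markov chain, before and after a change to one row of the transition matrix. It evaluates a single perturbed chain only; it does not compute the non-asymptotic or asymptotic bounds discussed in the abstract. The three-state chain, the function names, and the choice of the maximum absolute row sum as the distance are all assumptions made for this example.

```python
def reach_probability(P, target, iterations=10_000, tol=1e-12):
    """Probability of eventually reaching `target` from every state.

    P is a row-stochastic matrix (list of lists); `target` is a set of
    state indices. Iterates from 0 on non-target states, which converges
    to the least fixed point, i.e. the reachability probability.
    """
    n = len(P)
    x = [1.0 if s in target else 0.0 for s in range(n)]
    for _ in range(iterations):
        new = [
            1.0 if s in target else sum(P[s][t] * x[t] for t in range(n))
            for s in range(n)
        ]
        if max(abs(a - b) for a, b in zip(new, x)) < tol:
            return new
        x = new
    return x


def row_sum_distance(P, Q):
    """Largest absolute row sum of P - Q (illustrative distance choice)."""
    return max(sum(abs(a - b) for a, b in zip(rp, rq)) for rp, rq in zip(P, Q))


# Hypothetical chain: state 0 is transient, 1 is the goal, 2 is a sink.
P = [[0.5, 0.30, 0.20], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
# Perturbed chain: 0.05 of probability mass moved from the goal to the sink.
Q = [[0.5, 0.25, 0.25], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

before = reach_probability(P, {1})[0]
after = reach_probability(Q, {1})[0]
print(row_sum_distance(P, Q), before, after)
```

Here a perturbation of size 0.1 under the chosen distance shifts the reachability probability from state 0 by about 0.1; a perturbation bound would bound this shift over every chain within the given distance, not just this one.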
Adversarial Robustness Verification and Attack Synthesis in Stochastic Systems
Probabilistic model checking is a useful technique for specifying and
verifying properties of stochastic systems including randomized protocols and
reinforcement learning models. Existing methods rely on the assumed structure
and probabilities of certain system transitions. These assumptions may be
incorrect, and may even be violated by an adversary who gains control of system
components.
In this paper, we develop a formal framework for adversarial robustness in
systems modeled as discrete time Markov chains (DTMCs). We base our framework
on existing methods for verifying probabilistic temporal logic properties and
extend it to include deterministic, memoryless policies acting in Markov
decision processes (MDPs). Our framework includes a flexible approach for
specifying structure-preserving and non-structure-preserving adversarial
models. We outline a class of threat models under which adversaries can perturb
system transitions, constrained by an $\varepsilon$-ball around the original
transition probabilities.
We define three main DTMC adversarial robustness problems: adversarial
robustness verification, maximal $\delta$ synthesis, and worst case attack
synthesis. We present two optimization-based solutions to these three problems,
leveraging traditional and parametric probabilistic model checking techniques.
We then evaluate our solutions on two stochastic protocols and a collection of
Grid World case studies, which model an agent acting in an environment
described as an MDP. We find that the parametric solution results in fast
computation for small parameter spaces. In the case of less restrictive
(stronger) adversaries, the number of parameters increases, and directly
computing property satisfaction probabilities is more scalable. We demonstrate
the usefulness of our definitions and solutions by comparing system outcomes
over various properties, threat models, and case studies.
Comment: To appear, 35th IEEE Computer Security Foundations Symposium (2022)
Safety-aware apprenticeship learning
It is well acknowledged in the AI community that finding a good reward function for reinforcement learning is extremely challenging. Apprenticeship learning (AL) is a class of “learning from demonstration” techniques where the reward function of a Markov Decision Process (MDP) is unknown to the learning agent and the agent uses inverse reinforcement learning (IRL) methods to recover the expert policy from a set of expert demonstrations. However, because the agent learns exclusively from observations, there is neither verification nor any guarantee that the learnt policy satisfies a given constraint on the probability of the agent running into unwanted situations. In this dissertation, we study the problem of how to guide AL to learn a policy that is inherently safe while still meeting its learning objective. By combining formal methods with imitation learning, we propose a Counterexample-Guided Apprenticeship Learning algorithm. We consider a setting where the unknown reward function is assumed to be a linear combination of a set of state features, and the safety property is specified in Probabilistic Computation Tree Logic (PCTL). By embedding probabilistic model checking inside AL, we obtain a novel counterexample-guided approach that can ensure both safety and performance of the learnt policy. Given a formal safety specification in probabilistic temporal logic, the algorithm guarantees that the learnt policy satisfies this specification. We demonstrate the effectiveness of our approach on several challenging AL scenarios where safety is essential.
Parameter-Independent Strategies for pMDPs via POMDPs
Markov Decision Processes (MDPs) are a popular class of models suitable for
solving control decision problems in probabilistic reactive systems. We
consider parametric MDPs (pMDPs) that include parameters in some of the
transition probabilities to account for stochastic uncertainties of the
environment such as noise or input disturbances.
We study pMDPs with reachability objectives where the parameter values are
unknown and impossible to measure directly during execution, but there is a
probability distribution known over the parameter values. We study for the
first time computing parameter-independent strategies that are expectation
optimal, i.e., optimize the expected reachability probability under the
probability distribution over the parameters. We present an encoding of our
problem to partially observable MDPs (POMDPs), i.e., a reduction of our problem
to computing optimal strategies in POMDPs.
We evaluate our method experimentally on several benchmarks: a motivating
(repeated) learner model; a series of benchmarks of varying configurations of a
robot moving on a grid; and a consensus protocol.
Comment: Extended version of a QEST 2018 paper
Qualitative Reachability for Open Interval Markov Chains
Interval Markov chains extend classical Markov chains with the possibility to
describe transition probabilities using intervals, rather than exact values.
While the standard formulation of interval Markov chains features closed
intervals, previous work has considered also open interval Markov chains, in
which the intervals can also be open or half-open. In this paper we focus on
qualitative reachability problems for open interval Markov chains, which
consider whether the optimal (maximum or minimum) probability with which a
certain set of states can be reached is equal to 0 or 1. We present
polynomial-time algorithms for these problems for both of the standard
semantics of interval Markov chains. Our methods do not rely on the closure of
open intervals, in contrast to previous approaches for open interval Markov
chains, and can characterise situations in which probability 0 or 1 can be
attained not exactly but arbitrarily closely.
Comment: Full version of a paper published at RP 201