2,440 research outputs found

    Modelling and analysis of Markov reward automata (extended version)

    Costs and rewards are important ingredients for cyber-physical systems, modelling critical aspects like energy consumption, task completion, repair costs, and memory usage. This paper introduces Markov reward automata, an extension of Markov automata that allows the modelling of systems incorporating rewards (or costs) in addition to nondeterminism, discrete probabilistic choice and continuous stochastic timing. Rewards come in two flavours: action rewards, acquired instantaneously when taking a transition; and state rewards, acquired while residing in a state. We present algorithms to optimise three reward functions: the expected cumulative reward until a goal is reached; the expected cumulative reward until a certain time bound; and the long-run average reward. We have implemented these algorithms in the SCOOP/IMCA tool chain and show their feasibility via several case studies.
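
    The two reward flavours are easy to make concrete. The following is a minimal, hypothetical Python encoding of a Markov reward automaton, given only as an illustration of the model (it is not the SCOOP/IMCA input format): state rewards sit on states, action rewards on probabilistic transitions, and exponential rates on Markovian transitions.

```python
# A minimal, hypothetical Python encoding of a Markov reward automaton (MRA).
# Illustration only -- this is NOT the SCOOP/IMCA representation. State rewards
# are earned per unit of residence time, action rewards are earned instantly
# when a probabilistic transition fires, Markovian transitions carry rates.

from dataclasses import dataclass, field


@dataclass
class ProbTransition:
    action: str
    reward: float                # action reward, earned when the transition fires
    targets: dict[int, float]    # successor state -> probability (sums to 1)


@dataclass
class MarkovianTransition:
    rate: float                  # rate of the exponentially distributed delay
    target: int


@dataclass
class State:
    state_reward: float = 0.0    # earned per time unit while residing here
    prob_transitions: list[ProbTransition] = field(default_factory=list)
    markovian_transitions: list[MarkovianTransition] = field(default_factory=list)


# Two-state example: state 0 chooses nondeterministically between two actions;
# state 1 delays with rate 3.0 and accumulates a state reward while waiting.
mra = {
    0: State(prob_transitions=[
        ProbTransition("fast", reward=5.0, targets={1: 1.0}),
        ProbTransition("cheap", reward=1.0, targets={0: 0.5, 1: 0.5}),
    ]),
    1: State(state_reward=2.0,
             markovian_transitions=[MarkovianTransition(rate=3.0, target=0)]),
}
```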

    Modelling and analysis of Markov reward automata

    Costs and rewards are important ingredients for many types of systems, modelling critical aspects like energy consumption, task completion, repair costs, and memory usage. This paper introduces Markov reward automata, an extension of Markov automata that allows the modelling of systems incorporating rewards (or costs) in addition to nondeterminism, discrete probabilistic choice and continuous stochastic timing. Rewards come in two flavours: action rewards, acquired instantaneously when taking a transition; and state rewards, acquired while residing in a state. We present algorithms to optimise three reward functions: the expected cumulative reward until a goal is reached, the expected cumulative reward until a certain time bound, and the long-run average reward. We have implemented these algorithms in the SCOOP/IMCA tool chain and show their feasibility via several case studies.
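
    For the first objective, the untimed core reduces to a standard expected-total-reward computation on a Markov decision process. The sketch below shows that standard value iteration on invented numbers; it is not the paper's algorithm and ignores the continuous stochastic timing.

```python
# Value-iteration sketch for the maximal expected cumulative reward until a
# goal set is reached, on a plain MDP. Assumptions: the goal is reached almost
# surely under every scheduler (so the iteration converges) and every non-goal
# state has at least one action. This ignores the continuous timing of Markov
# reward automata and is not the paper's algorithm.

def expected_reward_to_goal(states, actions, goal, eps=1e-8):
    """actions(s) -> list of (action_reward, {successor: probability}) pairs."""
    v = {s: 0.0 for s in states}
    while True:
        delta = 0.0
        for s in states:
            if s in goal:
                continue                 # goal states accumulate nothing further
            best = max(r + sum(p * v[t] for t, p in succ.items())
                       for r, succ in actions(s))
            delta = max(delta, abs(best - v[s]))
            v[s] = best
        if delta < eps:
            return v


# Tiny invented example: from state 0, action A pays 1 and reaches the goal {2}
# with probability 0.5 (otherwise it stays); action B pays 4 and reaches it surely.
acts = {0: [(1.0, {0: 0.5, 2: 0.5}), (4.0, {2: 1.0})]}
print(expected_reward_to_goal([0, 2], lambda s: acts[s], goal={2}))  # {0: 4.0, 2: 0.0}
```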

    Towards efficient analysis of Markov automata

    One of the most expressive formalisms to model concurrent systems is Markov automata. They serve as a semantics for many higher-level formalisms, such as generalised stochastic Petri nets [Mar95, EHZ10] and dynamic fault trees [DBB90]. Two of the most challenging problems for Markov automata to date are (i) the optimal time-bounded reachability probability and (ii) the optimal long-run average reward. In this thesis, we aim at designing efficient, sound techniques to analyse them. We approach the problem of time-bounded reachability from two different angles. First, we study the properties of the optimal solution and exploit this knowledge to construct an efficient algorithm that approximates the optimal values up to a guaranteed error bound. This algorithm is exhaustive, i.e. it computes values for each state of the Markov automaton. This may be a limitation for very large or even infinite Markov automata. To address this issue we design a second algorithm that approximates the optimal solution by working only with a part of the total state space. For the problem of long-run average rewards there exists a polynomial algorithm based on linear programming. Instead of chasing a better theoretical complexity bound, we search for a practical solution based on an iterative approach. We design a value iteration algorithm that, in our empirical evaluation, turns out to scale several orders of magnitude better than the linear-programming-based approach.
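
    As a rough illustration of the kind of iterative scheme preferred here over linear programming, the following is a standard relative value-iteration sketch for the maximal long-run average reward of a unichain MDP. It is not the thesis's algorithm for Markov automata and, unlike it, offers no guaranteed error bound; the example is invented.

```python
# Relative value iteration for the maximal long-run average reward of a unichain
# MDP: a standard textbook scheme, shown only to illustrate an iterative
# alternative to linear programming. It is not the thesis's algorithm for Markov
# automata and provides no guaranteed error bound.

def long_run_average(states, actions, ref_state, eps=1e-8, max_iter=100_000):
    """actions(s) -> list of (reward, {successor: probability}) pairs."""
    h = {s: 0.0 for s in states}                       # relative value function
    new = dict(h)
    for _ in range(max_iter):
        new = {s: max(r + sum(p * h[t] for t, p in succ.items())
                      for r, succ in actions(s))
               for s in states}
        diff = [new[s] - h[s] for s in states]
        if max(diff) - min(diff) < eps:                # span small: gain converged
            break
        h = {s: new[s] - new[ref_state] for s in states}   # renormalise at ref_state
    return new[ref_state]                              # gain estimate


# Invented two-state example: staying in state 0 pays 1 per step; the second
# action pays 0 but reaches state 1 half of the time, where a reward of 4 is
# collected before returning. Optimal long-run average reward: 4/3.
acts = {0: [(1.0, {0: 1.0}), (0.0, {0: 0.5, 1: 0.5})],
        1: [(4.0, {0: 1.0})]}
print(long_run_average([0, 1], lambda s: acts[s], ref_state=0))
```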

    The Complexity of POMDPs with Long-run Average Objectives

    We study the problem of approximation of optimal values in partially observable Markov decision processes (POMDPs) with long-run average objectives. POMDPs are a standard model for dynamic systems with probabilistic and nondeterministic behavior in uncertain environments. In long-run average objectives, rewards are associated with every transition of the POMDP and the payoff is the long-run average of the rewards along the executions of the POMDP. We establish strategy complexity and computational complexity results. Our main result shows that finite-memory strategies suffice for approximation of optimal values, and that the related decision problem is recursively-enumerable-complete.
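
    What makes the sufficiency of finite-memory strategies useful is that a fixed finite-memory strategy is easy to evaluate: composing the POMDP with the controller yields a finite Markov chain whose stationary distribution determines the long-run average reward. The sketch below performs that evaluation on an invented POMDP and controller; deciding whether a good controller exists is the hard problem studied in the paper.

```python
# Evaluating the long-run average reward of a *fixed* finite-memory controller
# on a POMDP: compose the POMDP with the controller into a finite Markov chain
# over (state, memory) pairs and read the average reward off the stationary
# distribution. Model, controller, and rewards below are invented.

import numpy as np

# POMDP: trans[s][a] = list of (successor, probability); obs[s] = observation
# emitted in state s; rew[s][a] = reward of playing action a in state s.
trans = {0: {0: [(0, 0.5), (1, 0.5)], 1: [(1, 1.0)]},
         1: {0: [(0, 1.0)],           1: [(1, 0.7), (0, 0.3)]}}
obs = {0: "low", 1: "high"}
rew = {0: {0: 1.0, 1: 0.0}, 1: {0: 0.0, 1: 2.0}}

# Finite-memory controller: act[m] = action played in memory state m,
# upd[(m, observation)] = next memory state.
act = {0: 0, 1: 1}
upd = {(0, "low"): 0, (0, "high"): 1, (1, "low"): 0, (1, "high"): 1}

# Induced Markov chain over product states (s, m).
product = [(s, m) for s in trans for m in act]
idx = {x: i for i, x in enumerate(product)}
P = np.zeros((len(product), len(product)))
r = np.zeros(len(product))
for (s, m), i in idx.items():
    a = act[m]
    r[i] = rew[s][a]
    for s2, p in trans[s][a]:
        P[i, idx[(s2, upd[(m, obs[s2])])]] += p

# Stationary distribution: solve pi P = pi with sum(pi) = 1 (assumes a single
# recurrent class, otherwise the average depends on the initial state).
A = np.vstack([P.T - np.eye(len(product)), np.ones(len(product))])
b = np.zeros(len(product) + 1)
b[-1] = 1.0
pi = np.linalg.lstsq(A, b, rcond=None)[0]
print("long-run average reward:", pi @ r)   # 1.625 for this example
```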

    One-Counter Stochastic Games

    We study the computational complexity of basic decision problems for one-counter simple stochastic games (OC-SSGs), under various objectives. OC-SSGs are 2-player turn-based stochastic games played on the transition graph of classic one-counter automata. We study primarily the termination objective, where the goal of one player is to maximize the probability of reaching counter value 0, while the other player wishes to avoid this. Partly motivated by the goal of understanding termination objectives, we also study certain "limit" and "long-run average" reward objectives that are closely related to some well-studied objectives for stochastic games with rewards. Examples of problems we address include: does player 1 have a strategy to ensure that the counter eventually hits 0, i.e., terminates, almost surely, regardless of what player 2 does? Or that the liminf (or limsup) counter value equals infinity with a desired probability? Or that the long-run average reward is > 0 with a desired probability? We show that the qualitative termination problem for OC-SSGs is in NP ∩ coNP, and is in P-time for 1-player OC-SSGs, or equivalently for one-counter Markov decision processes (OC-MDPs). Moreover, we show that quantitative limit problems for OC-SSGs are in NP ∩ coNP, and are in P-time for 1-player OC-MDPs. Both qualitative limit problems and qualitative termination problems for OC-SSGs are already at least as hard as Condon's quantitative decision problem for finite-state SSGs.
    Comment: 20 pages, 1 figure. This is a full version of a paper accepted for publication in the proceedings of FSTTCS 201
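
    For intuition about the termination objective, consider the simplest case with no players and no nondeterminism: a one-counter Markov chain whose counter moves down with probability q and up with probability 1-q. By the classic gambler's-ruin argument it terminates with probability 1 if q >= 1/2, and with probability (q/(1-q))^n from counter value n otherwise. The purely illustrative check below compares this closed form against a Monte Carlo estimate; it is unrelated to the game and MDP algorithms of the paper.

```python
# Purely illustrative: the termination objective on the simplest one-counter
# Markov chain (no players, no nondeterminism). The counter starts at n and
# moves -1 with probability q and +1 with probability 1-q at every step.
# Gambler's ruin gives termination probability 1 if q >= 1/2, else (q/(1-q))**n.

import random

def estimate_termination(n, q, runs=10_000, cutoff=1_000):
    """Monte Carlo estimate; runs longer than `cutoff` steps count as non-terminating."""
    hits = 0
    for _ in range(runs):
        c = n
        for _ in range(cutoff):
            c += -1 if random.random() < q else 1
            if c == 0:
                hits += 1
                break
    return hits / runs

n, q = 3, 0.4
print("closed form :", (q / (1 - q)) ** n)        # (2/3)**3 = 0.296...
print("monte carlo :", estimate_termination(n, q))
```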

    Discrete-time rewards model-checked

    This paper presents a model-checking approach for analyzing discrete-time Markov reward models. For this purpose, the temporal logic probabilistic CTL is extended with reward constraints. This allows complex measures – involving expected as well as accumulated rewards – to be formulated in a precise and succinct way. Algorithms to efficiently analyze such formulae are introduced. The approach is illustrated by model-checking a probabilistic cost model of the IPv4 zeroconf protocol for distributed address assignment in ad-hoc networks.
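
    To make one such measure concrete: the expected reward accumulated in a discrete-time Markov reward model before reaching a goal set G satisfies a linear equation system over the non-goal states. The sketch below solves that system for an invented three-state chain; it illustrates the measure only, not the paper's model-checking algorithms.

```python
# Expected reward accumulated in a discrete-time Markov reward model before
# reaching a goal set G: with x = 0 on G, the remaining values satisfy the
# linear system x = r + P_restricted x. Standard DTMC computation on invented
# numbers; it illustrates the measure, not the paper's model-checking algorithms.

import numpy as np

P = np.array([[0.2, 0.5, 0.3],      # transition matrix of a 3-state DTMC
              [0.0, 0.4, 0.6],
              [0.0, 0.0, 1.0]])     # state 2 is the absorbing goal state
r = np.array([2.0, 1.0, 0.0])       # state reward earned on each visit
goal = [2]
rest = [s for s in range(len(r)) if s not in goal]

# x[s] = r[s] + sum_t P[s, t] * x[t] for non-goal s, assuming G is reached
# almost surely from every state.
A = np.eye(len(rest)) - P[np.ix_(rest, rest)]
x = np.linalg.solve(A, r[rest])
print(dict(zip(rest, x)))            # expected accumulated reward from states 0 and 1
```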

    POMDPs under Probabilistic Semantics

    We consider partially observable Markov decision processes (POMDPs) with limit-average payoff, where a reward value in the interval [0,1] is associated to every transition, and the payoff of an infinite path is the long-run average of the rewards. We consider two types of path constraints: (i) a quantitative constraint, which defines the set of paths where the payoff is at least a given threshold lambda_1 in (0,1]; and (ii) a qualitative constraint, which is the special case of the quantitative constraint with lambda_1 = 1. We consider the computation of the almost-sure winning set, where the controller needs to ensure that the path constraint is satisfied with probability 1. Our main results for the qualitative path constraint are as follows: (i) the problem of deciding the existence of a finite-memory controller is EXPTIME-complete; and (ii) the problem of deciding the existence of an infinite-memory controller is undecidable. For the quantitative path constraint we show that the problem of deciding the existence of a finite-memory controller is undecidable.
    Comment: Appears in Proceedings of the Twenty-Ninth Conference on Uncertainty in Artificial Intelligence (UAI 2013)