Transient Reward Approximation for Continuous-Time Markov Chains
We are interested in the analysis of very large continuous-time Markov chains
(CTMCs) with many distinct rates. Such models arise naturally in the context of
reliability analysis, e.g., in computer network performability analysis, in
power grids, in computer virus vulnerability, and in the study of crowd
dynamics. We use abstraction techniques together with novel algorithms for the
computation of bounds on the expected final and accumulated rewards in
continuous-time Markov decision processes (CTMDPs). These ingredients are
combined in a partly symbolic and partly explicit (symblicit) analysis
approach. In particular, we circumvent the use of multi-terminal decision
diagrams, because the latter do not work well if facing a large number of
different rates. We demonstrate the practical applicability and efficiency of
the approach on two case studies.
Comment: Accepted for publication in IEEE Transactions on Reliability.
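To make the quantity being bounded concrete: for a plain CTMC (without nondeterminism), the expected reward accumulated up to a time horizon t can be computed by uniformization. The sketch below illustrates only this textbook baseline, not the paper's abstraction-based symblicit algorithm; all function and variable names are illustrative.

```python
import numpy as np
from scipy.stats import poisson

def expected_accumulated_reward(Q, r, t, eps=1e-9):
    """Expected reward accumulated by a CTMC up to time t, per start state.
    Q: generator matrix (rows sum to 0), r: vector of state reward rates.
    Uses uniformization:
      E[int_0^t r(X_s) ds] = (1/lam) * sum_k Pr[Poisson(lam*t) >= k+1] * P^k r
    """
    lam = max(-Q.diagonal())                     # uniformization rate
    P = np.eye(len(r)) + Q / lam                 # uniformized DTMC
    K = int(poisson.ppf(1 - eps, lam * t)) + 1   # series truncation point
    v = np.zeros(len(r))
    pk_r = np.asarray(r, dtype=float)            # holds P^k r, starting at k = 0
    for k in range(K):
        v += (poisson.sf(k, lam * t) / lam) * pk_r
        pk_r = P @ pk_r
    return v
```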
Quantitative Safety: Linking Proof-Based Verification with Model Checking for Probabilistic Systems
This paper presents a novel approach for augmenting proof-based verification
with performance-style analysis of the kind employed in state-of-the-art model
checking tools for probabilistic systems. Quantitative safety properties,
usually specified as probabilistic system invariants and modeled in proof-based
environments, are evaluated using bounded model checking techniques.
Our specific contributions include the statement of a theorem that is central
to model checking safety properties of proof-based systems, the establishment
of a procedure, and its full implementation in a prototype system (YAGA), which
readily transforms a probabilistic model specified in a proof-based environment
to its equivalent verifiable PRISM model equipped with reward structures. The
reward structures capture the exact interpretation of the probabilistic
invariants and can reveal succinct information about the model during
experimental investigations. Finally, we demonstrate the novelty of the
technique on a probabilistic library case study.
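As a rough illustration of the bounded-model-checking flavour of such an analysis (not YAGA's actual procedure), the probability that a discrete-time Markov chain stays inside an invariant set for a bounded number of steps can be computed by iterated backups; all names below are placeholders.

```python
import numpy as np

def bounded_safety_probability(P, safe, k):
    """Pr[the chain stays inside the `safe` set for its first k steps],
    per start state, for a DTMC with transition matrix P.
    safe: 0/1 indicator vector of states satisfying the invariant."""
    mask = np.asarray(safe, dtype=float)
    v = mask.copy()                  # safe with 0 remaining steps
    for _ in range(k):
        v = mask * (P @ v)           # safe now, and safe for the rest
    return v
```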
LTLf/LDLf Non-Markovian Rewards
In Markov Decision Processes (MDPs), the reward obtained in a state is Markovian, i.e., depends on the last state and action. This dependency makes it difficult to reward more interesting long-term behaviors, such as always closing a door after it has been opened, or providing coffee only following a request. Extending MDPs to handle non-Markovian reward functions was the subject of two previous lines of work. Both use LTL variants to specify the reward function and then compile the new model back into a Markovian model. Building on recent progress in temporal logics over finite traces, we adopt LDLf for specifying non-Markovian rewards and provide an elegant automata construction for building a Markovian model, which extends that of previous work and offers strong minimality and compositionality guarantees.
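The core compilation idea, stripped of the minimality and compositionality machinery, is a product construction: pair each MDP state with the state of a DFA (obtained from the LTLf/LDLf formula) that tracks the relevant history, and pay the non-Markovian reward whenever the DFA component is accepting. A minimal sketch, with all names hypothetical:

```python
def compile_reward(mdp_trans, dfa_delta, dfa_states, dfa_acc, bonus):
    """Product of an MDP with a DFA that reads the sequence of visited states.
    mdp_trans: dict (s, a) -> list of (prob, s') pairs
    dfa_delta: dict (q, s') -> q'
    Returns product transitions over (s, q) pairs and a Markovian reward
    that pays `bonus` exactly in accepting DFA states."""
    prod_trans = {}
    for (s, a), succ in mdp_trans.items():
        for q in dfa_states:
            # The DFA advances on the MDP state just reached.
            prod_trans[((s, q), a)] = [(p, (s2, dfa_delta[(q, s2)]))
                                       for p, s2 in succ]
    prod_reward = {(s, q): (bonus if q in dfa_acc else 0.0)
                   for (s, _a) in mdp_trans for q in dfa_states}
    return prod_trans, prod_reward
```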
Value Iteration for Long-run Average Reward in Markov Decision Processes
Markov decision processes (MDPs) are standard models for probabilistic
systems with non-deterministic behaviours. Long-run average rewards provide a
mathematically elegant formalism for expressing long term performance. Value
iteration (VI) is one of the simplest and most efficient algorithmic approaches
to MDPs with other objectives, such as reachability. Unfortunately,
a naive extension of VI does not work for MDPs with long-run average rewards,
as there is no known stopping criterion. In this work our contributions are
threefold. (1) We refute a conjecture related to stopping criteria for MDPs
with long-run average rewards. (2) We present two practical algorithms for MDPs
with long-run average rewards based on VI. First, we show that a combination of
applying VI locally for each maximal end-component (MEC) and VI for
reachability objectives can provide approximation guarantees. Second, extending
the above approach with a simulation-guided on-demand variant of VI, we present
an anytime algorithm that is able to deal with very large models. (3) Finally,
we present experimental results showing that our methods significantly
outperform the standard approaches on several benchmarks.
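For context on the baseline the paper improves upon: on a communicating, aperiodic MDP (a single MEC), relative value iteration with a span-seminorm stopping rule does converge to the optimal gain; the difficulty the paper addresses is that no such criterion was known for general MDPs with several MECs. A textbook-style sketch under those restrictive assumptions:

```python
import numpy as np

def relative_value_iteration(R, T, eps=1e-6):
    """Optimal long-run average reward via relative VI.
    R[s, a]: immediate reward; T[s, a, s']: transition probability.
    Sound only for communicating, aperiodic MDPs (a single MEC)."""
    v = np.zeros(R.shape[0])
    while True:
        q = R + np.einsum('sat,t->sa', T, v)   # one Bellman backup
        v_new = q.max(axis=1)
        diff = v_new - v
        if diff.max() - diff.min() < eps:      # span(v_new - v) is small
            gain = 0.5 * (diff.max() + diff.min())
            return gain, q.argmax(axis=1)      # average reward and a policy
        v = v_new - v_new.min()                # keep values bounded
```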
Parameter-Independent Strategies for pMDPs via POMDPs
Markov Decision Processes (MDPs) are a popular class of models suitable for
solving control decision problems in probabilistic reactive systems. We
consider parametric MDPs (pMDPs) that include parameters in some of the
transition probabilities to account for stochastic uncertainties of the
environment such as noise or input disturbances.
We study pMDPs with reachability objectives where the parameter values are
unknown and impossible to measure directly during execution, but a
probability distribution over the parameter values is known. We study, for the
first time, the computation of parameter-independent strategies that are
expectation optimal, i.e., that optimize the expected reachability probability under the
probability distribution over the parameters. We present an encoding of our
problem to partially observable MDPs (POMDPs), i.e., a reduction of our problem
to computing optimal strategies in POMDPs.
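The shape of that reduction is simple to state: the hidden parameter is drawn once from the known distribution, remains fixed forever, and is excluded from the observations, so any observation-based POMDP strategy is by construction parameter-independent. A minimal sketch with hypothetical names:

```python
def pmdp_to_pomdp(param_values, param_prior, trans, init_state):
    """Encode a pMDP whose parameter theta has a known prior as a POMDP.
    trans: (theta, s, a) -> list of (prob, s') for parameter value theta.
    POMDP states are (theta, s); theta is fixed once by the initial
    belief and never changes; observations reveal only s."""
    init_belief = {(theta, init_state): p
                   for theta, p in zip(param_values, param_prior)}
    def pomdp_trans(state, a):
        theta, s = state
        return [(p, (theta, s2)) for p, s2 in trans(theta, s, a)]
    def observe(state):
        _theta, s = state
        return s                     # the parameter stays hidden
    return init_belief, pomdp_trans, observe
```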
We evaluate our method experimentally on several benchmarks: a motivating
(repeated) learner model; a series of benchmarks of varying configurations of a
robot moving on a grid; and a consensus protocol.
Comment: Extended version of a QEST 2018 paper.