Termination Criteria for Solving Concurrent Safety and Reachability Games
We consider concurrent games played on graphs. At every round of a game, each
player simultaneously and independently selects a move; the moves jointly
determine the transition to a successor state. Two basic objectives are the
safety objective to stay forever in a given set of states, and its dual, the
reachability objective to reach a given set of states. We present in this paper
a strategy improvement algorithm for computing the value of a concurrent safety
game, that is, the maximal probability with which player 1 can enforce the
safety objective. The algorithm yields a sequence of player-1 strategies which
ensure probabilities of winning that converge monotonically to the value of the
safety game.
Our result is significant because the strategy improvement algorithm
provides, for the first time, a way to approximate the value of a concurrent
safety game from below. Since a value iteration algorithm, or a strategy
improvement algorithm for reachability games, can be used to approximate the
same value from above, the combination of both algorithms yields a method for
computing a converging sequence of upper and lower bounds for the values of
concurrent reachability and safety games. Previous methods could approximate
the values of these games only from one direction, and as no rates of
convergence are known, they did not provide a practical way to solve these
games.
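The concurrent algorithm itself is involved, but its skeleton, namely evaluating the current strategy and then switching to locally improving moves so that the evaluated values increase monotonically, can be illustrated on the simpler turn-based (MDP) case. A minimal sketch, assuming a toy dictionary-based MDP encoding (all state, action, and function names are illustrative, not the paper's construction):

```python
# Hedged sketch: strategy (policy) improvement for maximal reachability
# in an MDP -- the turn-based analogue of the scheme described above.
# mdp[state][action] -> list of (successor, probability) pairs.

def evaluate(mdp, policy, targets, eps=1e-10):
    """Value of a fixed memoryless strategy, by iterating the induced chain."""
    v = {s: (1.0 if s in targets else 0.0) for s in mdp}
    while True:
        delta = 0.0
        for s in mdp:
            if s in targets:
                continue
            new = sum(p * v[t] for t, p in mdp[s][policy[s]])
            delta = max(delta, abs(new - v[s]))
            v[s] = new
        if delta < eps:
            return v

def strategy_improvement(mdp, targets):
    """Switch to improving moves until none exist; values only increase."""
    policy = {s: next(iter(mdp[s])) for s in mdp if s not in targets}
    while True:
        v = evaluate(mdp, policy, targets)
        improved = False
        for s in policy:
            def q(a):
                return sum(p * v[t] for t, p in mdp[s][a])
            best = max(mdp[s], key=q)
            if q(best) > v[s] + 1e-9:  # strict improvement only
                policy[s], improved = best, True
        if not improved:
            return v, policy

# Toy MDP: from s0, move 'a' hits the target with prob 0.5;
# move 'b' (the initial strategy) loops forever and wins with prob 0.
mdp = {
    "s0": {"b": [("s0", 1.0)],
           "a": [("goal", 0.5), ("sink", 0.5)]},
    "goal": {"stay": [("goal", 1.0)]},
    "sink": {"stay": [("sink", 1.0)]},
}
vals, pol = strategy_improvement(mdp, targets={"goal"})
print(round(vals["s0"], 6), pol["s0"])  # 0.5 a
```

Each round of the toy run mirrors the abstract's guarantee: the strategy switches from the losing loop 'b' to 'a', and its evaluated winning probability rises from 0 to the game value 0.5.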
Value Iteration for Long-run Average Reward in Markov Decision Processes
Markov decision processes (MDPs) are standard models for probabilistic
systems with non-deterministic behaviours. Long-run average rewards provide a
mathematically elegant formalism for expressing long term performance. Value
iteration (VI) is one of the simplest and most efficient algorithmic approaches
to MDPs with other properties, such as reachability objectives. Unfortunately,
a naive extension of VI does not work for MDPs with long-run average rewards,
as there is no known stopping criterion. In this work our contributions are
threefold. (1) We refute a conjecture related to stopping criteria for MDPs
with long-run average rewards. (2) We present two practical algorithms for MDPs
with long-run average rewards based on VI. First, we show that a combination of
applying VI locally for each maximal end-component (MEC) and VI for
reachability objectives can provide approximation guarantees. Second, extending
the above approach with a simulation-guided on-demand variant of VI, we present
an anytime algorithm that is able to deal with very large models. (3) Finally,
we present experimental results showing that our methods significantly
outperform the standard approaches on several benchmarks.
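For contrast with the long-run average case, the reachability VI that the approach builds on can be sketched as follows. Note the naive small-change stopping criterion, which is only a heuristic even here; for long-run average rewards no sound analogue of it is known, which is the gap the paper addresses. The encoding and all names are illustrative:

```python
# Hedged sketch: classic value iteration for the maximal reachability
# probability in a small MDP (toy encoding; names are made up).
# mdp[state][action] -> list of (successor, probability) pairs.

def value_iteration(mdp, targets, eps=1e-8, max_iters=100_000):
    """Iterate Bellman updates from below until changes fall under eps.

    Caveat: a small change per step is only a heuristic stopping
    criterion, not a guaranteed error bound -- the difficulty the
    abstract highlights for long-run average rewards.
    """
    v = {s: (1.0 if s in targets else 0.0) for s in mdp}
    for _ in range(max_iters):
        delta = 0.0
        for s in mdp:
            if s in targets:
                continue
            new = max(
                sum(p * v[t] for t, p in succs)
                for succs in mdp[s].values()
            )
            delta = max(delta, abs(new - v[s]))
            v[s] = new
        if delta < eps:
            break
    return v

# Toy MDP: from s0, action 'a' reaches the goal with prob 0.5,
# action 'b' loops forever (never reaches the goal).
mdp = {
    "s0": {"a": [("goal", 0.5), ("sink", 0.5)],
           "b": [("s0", 1.0)]},
    "goal": {"stay": [("goal", 1.0)]},
    "sink": {"stay": [("sink", 1.0)]},
}
vals = value_iteration(mdp, targets={"goal"})
print(round(vals["s0"], 6))  # 0.5
```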
Magnifying Lens Abstraction for Stochastic Games with Discounted and Long-run Average Objectives
Turn-based stochastic games and their important subclass, Markov decision
processes (MDPs), provide models for systems with both probabilistic and
nondeterministic behaviors. We consider turn-based stochastic games with two
classical quantitative objectives: discounted-sum and long-run average
objectives. The game models and the quantitative objectives are widely used in
probabilistic verification, planning, optimal inventory control, network
protocol and performance analysis. Games and MDPs that model realistic systems
often have very large state spaces, and probabilistic abstraction techniques
are necessary to handle the state-space explosion. The commonly used
full-abstraction techniques do not yield space savings for systems that have
many states with similar values but not necessarily similar transition
structures. A semi-abstraction technique, magnifying-lens abstraction (MLA),
which clusters states based on value alone, disregarding differences in their
transition relations, was proposed for qualitative objectives (reachability
and safety). In this paper we extend the
MLA technique to solve stochastic games with discounted-sum and long-run
average objectives. We present an MLA-based abstraction-refinement algorithm
for stochastic games and MDPs with discounted-sum objectives. For
long-run average objectives, our solution works for all MDPs and a sub-class of
stochastic games in which every state has the same value.
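As a point of reference for the discounted-sum objective, plain value iteration, the base computation that an MLA-style scheme would run on each magnified region, can be sketched as follows. This is a toy encoding, not the paper's algorithm; names and the discount factor are illustrative:

```python
# Hedged sketch: value iteration for a discounted-sum MDP.
# mdp[s][a] -> [(successor, probability)], reward[s][a] -> immediate reward.

def discounted_vi(mdp, reward, gamma=0.9, eps=1e-10):
    """Bellman updates for the discounted sum; contraction by gamma
    makes the stopping test below a genuine eps error bound."""
    v = {s: 0.0 for s in mdp}
    while True:
        delta = 0.0
        for s in mdp:
            new = max(
                reward[s][a] + gamma * sum(p * v[t] for t, p in succs)
                for a, succs in mdp[s].items()
            )
            delta = max(delta, abs(new - v[s]))
            v[s] = new
        # ||v - v*|| <= delta * gamma / (1 - gamma), so this bounds the error by eps.
        if delta < eps * (1 - gamma) / gamma:
            return v

# Toy chain: a single state with a self-loop paying reward 1 per step;
# the discounted value is 1 / (1 - gamma) = 10 for gamma = 0.9.
mdp = {"s": {"loop": [("s", 1.0)]}}
reward = {"s": {"loop": 1.0}}
print(round(discounted_vi(mdp, reward)["s"], 4))  # 10.0
```

Unlike in the reachability setting, the discount factor gives VI a contraction property, which is why a sound stopping criterion exists here.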
Strategy Improvement for Concurrent Safety Games
We consider concurrent games played on graphs. At every round of the game,
each player simultaneously and independently selects a move; the moves jointly
determine the transition to a successor state. Two basic objectives are the
safety objective: "stay forever in a set F of states", and its dual, the
reachability objective: "reach a set R of states". We present in this paper a
strategy improvement algorithm for computing the value of a concurrent safety
game, that is, the maximal probability with which player 1 can enforce the
safety objective. The algorithm yields a sequence of player-1 strategies which
ensure probabilities of winning that converge monotonically to the value of the
safety game.
The significance of the result is twofold. First, while strategy improvement
algorithms were known for Markov decision processes and turn-based games, as
well as for concurrent reachability games, this is the first strategy
improvement algorithm for concurrent safety games. Second, and most
importantly, the improvement algorithm provides a way to approximate the value
of a concurrent safety game from below (the known value-iteration algorithms
approximate the value from above). Thus, when used together with
value-iteration algorithms, or with strategy improvement algorithms for
reachability games, our algorithm leads to the first practical algorithm for
computing converging upper and lower bounds for the value of reachability and
safety games.
Comment: 19 pages, 1 figure.
Widest Paths and Global Propagation in Bounded Value Iteration for Stochastic Games
Solving stochastic games with the reachability objective is a fundamental
problem, especially in quantitative verification and synthesis. For this
purpose, bounded value iteration (BVI) attracts attention as an efficient
iterative method. However, BVI's performance is often impeded by costly end
component (EC) computation that is needed to ensure convergence. Our
contribution is a novel BVI algorithm that conducts, in addition to local
propagation by the Bellman update that is typical of BVI, global propagation of
upper bounds that is not hindered by ECs. To conduct global propagation in a
computationally tractable manner, we construct a weighted graph and solve the
widest path problem in it. Our experiments show the algorithm's performance
advantage over the previous BVI algorithms that rely on EC computation.
Comment: v2: a URL to the implementation is added.
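The widest-path subroutine itself is a standard max-bottleneck computation that can be solved Dijkstra-style: grow paths in order of decreasing bottleneck width instead of increasing length. A minimal sketch (the graph encoding and names are illustrative, not the paper's construction of the weighted graph from upper bounds):

```python
import heapq

def widest_path(graph, source):
    """Widest-path (max-bottleneck) values from source, Dijkstra-style.

    graph[u] -> list of (v, width) edges. best[v] is the largest w such
    that some path from source to v uses only edges of width >= w.
    """
    best = {source: float("inf")}
    heap = [(-float("inf"), source)]  # max-heap via negated widths
    while heap:
        neg_w, u = heapq.heappop(heap)
        w = -neg_w
        if w < best.get(u, -1.0):
            continue  # stale heap entry
        for v, width in graph.get(u, []):
            cand = min(w, width)  # bottleneck along the extended path
            if cand > best.get(v, 0.0):
                best[v] = cand
                heapq.heappush(heap, (-cand, v))
    return best

# Toy graph: s->a->t has bottleneck 0.6, s->b->t has bottleneck 0.4.
g = {
    "s": [("a", 0.9), ("b", 0.4)],
    "a": [("t", 0.6)],
    "b": [("t", 0.8)],
}
print(widest_path(g, "s")["t"])  # 0.6
```

With edge widths read as the candidate upper bounds, the widest-path value to a state caps its upper bound in one global step, which is the tractable global propagation the abstract refers to.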
Approximating values of generalized-reachability stochastic games
Simple stochastic games are turn-based 2½-player games with a reachability objective. The basic question asks whether one player can ensure reaching a given target with at least a given probability. A natural extension is games with a conjunction of such conditions as objective. Despite a plethora of recent results on the analysis of systems with multiple objectives, the decidability of this basic problem remains open. In this paper, we present an algorithm approximating the Pareto frontier of the achievable values to a given precision. Moreover, it is an anytime algorithm, meaning it can be stopped at any time, returning the current approximation and its error bound.
Stochastic Games with Disjunctions of Multiple Objectives (Technical Report)
Stochastic games combine controllable and adversarial non-determinism with
stochastic behavior and are a common tool in control, verification and
synthesis of reactive systems facing uncertainty. Multi-objective stochastic
games are natural in situations where several - possibly conflicting -
performance criteria like time and energy consumption are relevant. Such
conjunctive combinations are the most studied multi-objective setting in the
literature. In this paper, we consider the dual disjunctive problem. More
concretely, we study turn-based stochastic two-player games on graphs where the
winning condition is to guarantee at least one reachability or safety objective
from a given set of alternatives. We present a fine-grained overview of
strategy and computational complexity of such disjunctive queries (DQs)
and provide new lower and upper bounds for several variants of the problem,
significantly extending previous works. We also propose a novel value
iteration-style algorithm for approximating the set of Pareto optimal
thresholds for a given DQ.
Comment: Technical report including appendix with detailed proofs, 29 pages.