    Optimal Strategies in Infinite-state Stochastic Reachability Games

    We consider perfect-information reachability stochastic games for 2 players on infinite graphs. We identify a subclass of such games and prove two properties of it: first, Player Max always has optimal strategies in games from this subclass, and second, these games are strongly determined. The subclass is defined by the property that the set of all values can have only one accumulation point, namely 0. Our results mirror recent results for finitely-branching games, where, on the contrary, Player Min always has optimal strategies. However, our proof methods are substantially different, because the roles of the players are not symmetric. We also do not restrict the branching of the games. Finally, we apply our results in the context of recently studied One-Counter stochastic games.

    Minimizing Running Costs in Consumption Systems

    A standard approach to optimizing long-run running costs of discrete systems is based on minimizing the mean-payoff, i.e., the long-run average amount of resources ("energy") consumed per transition. However, this approach inherently assumes that the energy source has an unbounded capacity, which is not always realistic. For example, an autonomous robotic device has a battery of finite capacity that has to be recharged periodically, and the total amount of energy consumed between two successive charging cycles is bounded by the capacity. Hence, a controller minimizing the mean-payoff must obey this restriction. In this paper we study the controller synthesis problem for consumption systems with a finite battery capacity, where the task of the controller is to minimize the mean-payoff while preserving the functionality of the system encoded by a given linear-time property. We show that an optimal controller always exists, and it may either need only finite memory or require infinite memory (it is decidable in polynomial time which of the two cases holds). Further, we show how to compute an effective description of an optimal controller in polynomial time. Finally, we consider the limit values achievable by larger and larger battery capacity, show that these values are computable in polynomial time, and we also analyze the corresponding rate of convergence. To the best of our knowledge, these are the first results about optimizing the long-run running costs in systems with bounded energy stores. Comment: 32 pages, corrections of typos and minor omission

    Tableaux for Policy Synthesis for MDPs with PCTL* Constraints

    Markov decision processes (MDPs) are the standard formalism for modelling sequential decision making in stochastic environments. Policy synthesis addresses the problem of how to control or limit the decisions an agent makes so that a given specification is met. In this paper we consider PCTL*, the probabilistic counterpart of CTL*, as the specification language. Because in general the policy synthesis problem for PCTL* is undecidable, we restrict to policies whose execution history memory is finitely bounded a priori. Surprisingly, no algorithm for policy synthesis for this natural and expressive framework has been developed so far. We close this gap and describe a tableau-based algorithm that, given an MDP and a PCTL* specification, derives in a non-deterministic way a system of (possibly nonlinear) equalities and inequalities. The solutions of this system, if any, describe the desired (stochastic) policies. Our main result in this paper is the correctness of our method, i.e., soundness, completeness and termination. Comment: This is a long version of a conference paper published at TABLEAUX 2017. It contains proofs of the main results and fixes a bug. See the footnote on page 1 for detail

    One-Counter Stochastic Games

    We study the computational complexity of basic decision problems for one-counter simple stochastic games (OC-SSGs), under various objectives. OC-SSGs are 2-player turn-based stochastic games played on the transition graph of classic one-counter automata. We study primarily the termination objective, where the goal of one player is to maximize the probability of reaching counter value 0, while the other player wishes to avoid this. Partly motivated by the goal of understanding termination objectives, we also study certain "limit" and "long run average" reward objectives that are closely related to some well-studied objectives for stochastic games with rewards. Examples of problems we address include: does player 1 have a strategy to ensure that the counter eventually hits 0, i.e., terminates, almost surely, regardless of what player 2 does? Or that the liminf (or limsup) counter value equals infinity with a desired probability? Or that the long run average reward is >0 with desired probability? We show that the qualitative termination problem for OC-SSGs is in NP ∩ coNP, and is in P-time for 1-player OC-SSGs, or equivalently for one-counter Markov Decision Processes (OC-MDPs). Moreover, we show that quantitative limit problems for OC-SSGs are in NP ∩ coNP, and are in P-time for 1-player OC-MDPs. Both qualitative limit problems and qualitative termination problems for OC-SSGs are already at least as hard as Condon's quantitative decision problem for finite-state SSGs. Comment: 20 pages, 1 figure. This is a full version of a paper accepted for publication in proceedings of FSTTCS 201

    Trading Performance for Stability in Markov Decision Processes

    We study the complexity of central controller synthesis problems for finite-state Markov decision processes, where the objective is to optimize both the expected mean-payoff performance of the system and its stability. We argue that the basic theoretical notion of expressing the stability in terms of the variance of the mean-payoff (called global variance in our paper) is not always sufficient, since it ignores possible instabilities on respective runs. For this reason we propose alternative definitions of stability, which we call local and hybrid variance, and which express how rewards on each run deviate from the run's own mean-payoff and from the expected mean-payoff, respectively. We show that a strategy ensuring both the expected mean-payoff and the variance below given bounds requires randomization and memory, under all the above semantics of variance. We then look at the problem of determining whether there is such a strategy. For the global variance, we show that the problem is in PSPACE, and that the answer can be approximated in pseudo-polynomial time. For the hybrid variance, the analogous decision problem is in NP, and a polynomial-time approximating algorithm also exists. For local variance, we show that the decision problem is in NP. Since the overall performance can be traded for stability (and vice versa), we also present algorithms for approximating the associated Pareto curve in all three cases. Finally, we study a special case of the decision problems, where we require a given expected mean-payoff together with zero variance. Here we show that the problems can all be solved in polynomial time. Comment: Extended version of a paper presented at LICS 201

    Solving Stochastic Büchi Games on Infinite Arenas with a Finite Attractor

    We consider games played on an infinite probabilistic arena where the first player aims at satisfying generalized Büchi objectives almost surely, i.e., with probability one. We provide a fixpoint characterization of the winning sets and associated winning strategies in the case where the arena satisfies the finite-attractor property. From this we directly deduce the decidability of these games on probabilistic lossy channel systems. Comment: In Proceedings QAPL 2013, arXiv:1306.241

    Analyzing probabilistic pushdown automata

    The paper gives a summary of the existing results about algorithmic analysis of probabilistic pushdown automata and their subclasses.