381 research outputs found

    Polynomial Time Algorithms for Branching Markov Decision Processes and Probabilistic Min(Max) Polynomial Bellman Equations

    We show that one can approximate the least fixed point solution for a multivariate system of monotone probabilistic max(min) polynomial equations, referred to as maxPPSs (and minPPSs, respectively), in time polynomial in both the encoding size of the system of equations and in log(1/epsilon), where epsilon > 0 is the desired additive error bound of the solution. (The model of computation is the standard Turing machine model.) We establish this result using a generalization of Newton's method which applies to maxPPSs and minPPSs, even though the underlying functions are only piecewise-differentiable. This generalizes our recent work which provided a P-time algorithm for purely probabilistic PPSs. These equations form the Bellman optimality equations for several important classes of infinite-state Markov Decision Processes (MDPs). Thus, as a corollary, we obtain the first polynomial time algorithms for computing to within arbitrary desired precision the optimal value vector for several classes of infinite-state MDPs which arise as extensions of classic, and heavily studied, purely stochastic processes. These include both the problem of maximizing and minimizing the termination (extinction) probability of multi-type branching MDPs, stochastic context-free MDPs, and 1-exit Recursive MDPs. Furthermore, we also show that we can compute in P-time an epsilon-optimal policy for both maximizing and minimizing branching, context-free, and 1-exit-Recursive MDPs, for any given desired epsilon > 0. This is despite the fact that actually computing optimal strategies is Sqrt-Sum-hard and PosSLP-hard in this setting. We also derive, as an easy consequence of these results, an FNP upper bound on the complexity of computing the value (within arbitrary desired precision) of branching simple stochastic games (BSSGs)

    Greatest Fixed Points of Probabilistic Min/Max Polynomial Equations, and Reachability for Branching Markov Decision Processes

    We give polynomial time algorithms for quantitative (and qualitative) reachability analysis for Branching Markov Decision Processes (BMDPs). Specifically, given a BMDP, and given an initial population, where the objective of the controller is to maximize (or minimize) the probability of eventually reaching a population that contains an object of a desired (or undesired) type, we give algorithms for approximating the supremum (infimum) reachability probability, within desired precision epsilon > 0, in time polynomial in the encoding size of the BMDP and in log(1/epsilon). We furthermore give P-time algorithms for computing epsilon-optimal strategies for both maximization and minimization of reachability probabilities. We also give P-time algorithms for all associated qualitative analysis problems, namely: deciding whether the optimal (supremum or infimum) reachability probabilities are 0 or 1. Prior to this paper, approximation of optimal reachability probabilities for BMDPs was not even known to be decidable. Our algorithms exploit the following basic fact: we show that for any BMDP, its maximum (minimum) non-reachability probabilities are given by the greatest fixed point (GFP) solution g* in [0,1]^n of a corresponding monotone max (min) Probabilistic Polynomial System of equations (max/min-PPS), x=P(x), which are the Bellman optimality equations for a BMDP with non-reachability objectives. We show how to compute the GFP of max/min PPSs to desired precision in P-time. We also study more general Branching Simple Stochastic Games (BSSGs) with (non-)reachability objectives. We show that: (1) the value of these games is captured by the GFP of a corresponding max-minPPS; (2) the quantitative problem of approximating the value is in TFNP; and (3) the qualitative problems associated with the value are all solvable in P-time

    The complexity of analyzing infinite-state Markov chains, Markov decision processes, and stochastic games (Invited talk)

    In recent years, a considerable amount of research has been devoted to understanding the computational complexity of basic analysis problems, and model checking problems, for finitely-presented countable infinite-state probabilistic systems. In particular, we have studied recursive Markov chains (RMCs), recursive Markov decision processes (RMDPs) and recursive stochastic games (RSGs). These arise by adding a natural recursion feature to finite-state Markov chains, MDPs, and stochastic games. RMCs and RMDPs provide natural abstract models of probabilistic procedural programs with recursion, and they are expressively equivalent to probabilistic and MDP extensions of pushdown automata. Moreover, a number of well-studied stochastic processes, including multi-type branching processes, (discrete-time) quasi-birth-death processes, and stochastic context-free grammars, can be suitably captured by subclasses of RMCs. A central computational problem for analyzing various classes of recursive probabilistic systems is the computation of their (optimal) termination probabilities. These form a key ingredient for many other analyses, including model checking. For RMCs, and for important subclasses of RMDPs and RSGs, computing their termination values is equivalent to computing the least fixed point (LFP) solution of a corresponding monotone system of polynomial (min/max) equations. The complexity of computing the LFP solution for such equation systems is an intriguing problem, with connections to several areas of research. The LFP solution may in general be irrational. So, one possible aim is to compute it to within a desired additive error epsilon > 0. For general RMCs, approximating their termination probability within any non-trivial constant additive error < 1/2, is at least as hard as long-standing open problems in the complexity of numerical computation which are not even known to be in NP.
For several key subclasses of RMCs and RMDPs, computing their termination values turns out to be much more tractable. In this talk I will survey algorithms for, and discuss the computational complexity of, key analysis problems for classes of infinite-state recursive MCs, MDPs, and stochastic games. In particular, I will discuss recent joint work with Alistair Stewart and Mihalis Yannakakis (in papers that appeared at STOC'12 and ICALP'12), in which we have obtained polynomial time algorithms for computing, to within arbitrary desired precision, the LFP solution of probabilistic polynomial (min/max) systems of equations. Using this, we obtained the first P-time algorithms for computing (within desired precision) the extinction probabilities of multi-type branching processes, the probability that an arbitrary given stochastic context-free grammar generates a given string, and the optimum (maximum or minimum) extinction probabilities for branching MDPs and context-free MDPs. For branching MDPs, their corresponding equations amount to Bellman optimality equations for minimizing/maximizing their termination probabilities. Our algorithms combine variations and generalizations of Newton's method with other techniques, including linear programming. The algorithms are fairly easy to implement, but analyzing their worst-case running time is mathematically quite involved
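
As a rough illustration of the least fixed point computation described in this abstract, the sketch below runs plain multivariate Newton iteration, started from the zero vector, on a small purely probabilistic system x = P(x). It is not the authors' algorithm: the preprocessing, rounding, and linear programming steps behind the polynomial-time bounds are omitted, and the residual-based stopping test is only a heuristic. The two-type system, its coefficients, and the names `P`, `jacobian`, and `newton_lfp` are all assumptions made up for this example.

```python
# Hypothetical two-type branching process (illustrative coefficients only):
#   x1 = 0.6 * x1^2   + 0.4   (type 1: two type-1 children w.p. 0.6, dies w.p. 0.4)
#   x2 = 0.5 * x1 * x2 + 0.5  (type 2: one child of each type w.p. 0.5, dies w.p. 0.5)
# The extinction probabilities are the least fixed point (LFP) of x = P(x).

def P(x):
    x1, x2 = x
    return (0.6 * x1 * x1 + 0.4, 0.5 * x1 * x2 + 0.5)


def jacobian(x):
    # Matrix of partial derivatives of P at x.
    x1, x2 = x
    return ((1.2 * x1, 0.0), (0.5 * x2, 0.5 * x1))


def newton_lfp(P, jacobian, tol=1e-12, max_iter=100):
    """Plain Newton iteration x <- x + (I - P'(x))^{-1} (P(x) - x), from x = 0.

    Stops when the residual P(x) - x is small; this is a heuristic test,
    not a certified additive error bound.
    """
    x = (0.0, 0.0)
    for _ in range(max_iter):
        px = P(x)
        r = (px[0] - x[0], px[1] - x[1])
        if max(abs(r[0]), abs(r[1])) < tol:
            break
        J = jacobian(x)
        # Solve the 2x2 linear system (I - J) dx = r by Cramer's rule.
        a, b = 1.0 - J[0][0], -J[0][1]
        c, d = -J[1][0], 1.0 - J[1][1]
        det = a * d - b * c
        dx = ((d * r[0] - b * r[1]) / det, (a * r[1] - c * r[0]) / det)
        x = (x[0] + dx[0], x[1] + dx[1])
    return x


lfp = newton_lfp(P, jacobian)
print(lfp)
```

For this particular system the LFP can be checked by hand: the first equation has roots 2/3 and 1, so the least solution is x1 = 2/3, and substituting into the second gives x2 = 3/4. Note that (1, 1) is also a fixed point, which is why starting the iteration from zero matters.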

    Polynomial Time Algorithms for Multi-Type Branching Processes and Stochastic Context-Free Grammars

    We show that one can approximate the least fixed point solution for a multivariate system of monotone probabilistic polynomial equations in time polynomial in both the encoding size of the system of equations and in log(1/\epsilon), where \epsilon > 0 is the desired additive error bound of the solution. (The model of computation is the standard Turing machine model.) We use this result to resolve several open problems regarding the computational complexity of computing key quantities associated with some classic and heavily studied stochastic processes, including multi-type branching processes and stochastic context-free grammars

    Value Iteration for Simple Stochastic Games: Stopping Criterion and Learning Algorithm

    Simple stochastic games can be solved by value iteration (VI), which yields a sequence of under-approximations of the value of the game. This sequence is guaranteed to converge to the value only in the limit. Since no stopping criterion is known, this technique does not provide any guarantees on its results. We provide the first stopping criterion for VI on simple stochastic games. It is achieved by additionally computing a convergent sequence of over-approximations of the value, relying on an analysis of the game graph. Consequently, VI becomes an anytime algorithm returning the approximation of the value and the current error bound. As another consequence, we can provide a simulation-based asynchronous VI algorithm, which yields the same guarantees, but without necessarily exploring the whole game graph. Comment: CAV201
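
The idea of pairing under-approximations with over-approximations can be sketched on a toy game. The code below is a simplification, not the paper's algorithm: it assumes a game in which play reaches `target` or `sink` with probability 1 under all strategies (no end components), so iterating the Bellman operator from the all-ones vector already converges and the game-graph analysis described in the abstract is not needed. The game `GAME`, its state names, and the functions `bellman` and `bounded_vi` are hypothetical and exist only for this example.

```python
# Hypothetical simple stochastic game. Each state is (kind, successors):
#   "max"/"min" states list successor names; "rand" states list (prob, successor).
# "target" has value 1 and "sink" has value 0.
GAME = {
    "s0": ("max", ["s1", "s2"]),
    "s1": ("rand", [(0.5, "target"), (0.5, "sink")]),
    "s2": ("min", ["s3", "target"]),
    "s3": ("rand", [(0.3, "target"), (0.3, "s0"), (0.4, "sink")]),
}


def bellman(game, v):
    """One synchronous Bellman update of the value vector v."""
    new = dict(v)  # target and sink keep their fixed values
    for s, (kind, succ) in game.items():
        if kind == "max":
            new[s] = max(v[t] for t in succ)
        elif kind == "min":
            new[s] = min(v[t] for t in succ)
        else:  # random state: expectation over successors
            new[s] = sum(p * v[t] for p, t in succ)
    return new


def bounded_vi(game, eps=1e-6, max_iter=10_000):
    """Iterate lower and upper bounds until they differ by less than eps.

    Assumes the game has no end components; otherwise the upper bound
    may fail to converge and extra analysis is required.
    """
    lo = {s: 0.0 for s in game}
    hi = {s: 1.0 for s in game}
    lo.update(target=1.0, sink=0.0)
    hi.update(target=1.0, sink=0.0)
    for _ in range(max_iter):
        if max(hi[s] - lo[s] for s in game) < eps:
            break
        lo, hi = bellman(game, lo), bellman(game, hi)
    return lo, hi


lo, hi = bounded_vi(GAME)
print(lo["s0"], hi["s0"])
```

At any point the gap `hi[s] - lo[s]` bounds the error of the current estimate, which is what makes the procedure usable as an anytime algorithm in this restricted setting.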

    Reachability for Branching Concurrent Stochastic Games

    We give polynomial time algorithms for deciding almost-sure and limit-sure reachability in Branching Concurrent Stochastic Games (BCSGs). These are a class of infinite-state imperfect-information stochastic games that generalize both finite-state concurrent stochastic reachability games ([L. de Alfaro et al., 2007]) and branching simple stochastic reachability games ([K. Etessami et al., 2018])

    Reachability analysis of branching probabilistic processes

    We study a fundamental class of infinite-state stochastic processes and stochastic games, namely Branching Processes, under the properties of (single-target) reachability and multi-objective reachability. In particular, we study Branching Concurrent Stochastic Games (BCSGs), which are an imperfect-information game extension to the classical Branching Processes, and show that these games are determined, i.e., have a value, under the fundamental objective of reachability, building on and generalizing prior work on Branching Simple Stochastic Games and finite-state Concurrent Stochastic Games. We show that, unlike in the turn-based branching games, in the concurrent setting the almost-sure and limit-sure reachability problems do not coincide and we give polynomial time algorithms for deciding both almost-sure and limit-sure reachability. We also provide a discussion on the complexity of quantitative reachability questions for BCSGs. Furthermore, we introduce a new model, namely Ordered Branching Processes (OBPs), which is a hybrid model between classical Branching Processes and Stochastic Context-Free Grammars. Under the reachability objective, this model is equivalent to the classical Branching Processes. We study qualitative multi-objective reachability questions for Ordered Branching Markov Decision Processes (OBMDPs), or equivalently context-free MDPs with simultaneous derivation. We provide algorithmic results for efficiently checking certain Boolean combinations of qualitative reachability and non-reachability queries with respect to different given target non-terminals. Among the more interesting multi-objective reachability results, we provide two separate algorithms for almost-sure and limit-sure multi-target reachability for OBMDPs.
Specifically, given an OBMDP, given a starting non-terminal, and given a set of target non-terminals, our first algorithm decides whether the supremum probability, of generating a tree that contains every target non-terminal in the set, is 1. Our second algorithm decides whether there is a strategy for the player to almost-surely (with probability 1) generate a tree that contains every target non-terminal in the set. The two separate algorithms are needed: we show that indeed, in this context, almost-sure and limit-sure multi-target reachability do not coincide. Both algorithms run in time polynomial in the size of the OBMDP and exponential in the number of targets. Hence, they run in polynomial time when the number of targets is fixed. The algorithms are fixed-parameter tractable with respect to this number. Moreover, we show that the qualitative almost-sure (and limit-sure) multi-target reachability decision problem is in general NP-hard, when the size of the set of target non-terminals is not fixed