2,879 research outputs found

    Recursive Concurrent Stochastic Games

    Get PDF
    We study Recursive Concurrent Stochastic Games (RCSGs), extending our recent analysis of recursive simple stochastic games to a concurrent setting where the two players choose moves simultaneously and independently at each state. For multi-exit games, our earlier work already showed undecidability for basic questions like termination, thus we focus on the important case of single-exit RCSGs (1-RCSGs). We first characterize the value of a 1-RCSG termination game as the least fixed point solution of a system of nonlinear minimax functional equations, and use it to show PSPACE decidability for the quantitative termination problem. We then give a strategy improvement technique, which we use to show that player 1 (maximizer) has \epsilon-optimal randomized Stackless & Memoryless (r-SM) strategies for all \epsilon > 0, while player 2 (minimizer) has optimal r-SM strategies. Thus, such games are r-SM-determined. These results mirror and generalize in a strong sense the randomized memoryless determinacy results for finite stochastic games, and extend the classic Hoffman-Karp strategy improvement approach from the finite to an infinite state setting. The proofs in our infinite-state setting are very different however, relying on subtle analytic properties of certain power series that arise from studying 1-RCSGs. We show that our upper bounds, even for qualitative (probability 1) termination, can not be improved, even to NP, without a major breakthrough, by giving two reductions: first a P-time reduction from the long-standing square-root sum problem to the quantitative termination decision problem for finite concurrent stochastic games, and then a P-time reduction from the latter problem to the qualitative termination problem for 1-RCSGs.Comment: 21 pages, 2 figure

    One-Counter Stochastic Games

    Get PDF
    We study the computational complexity of basic decision problems for one-counter simple stochastic games (OC-SSGs), under various objectives. OC-SSGs are 2-player turn-based stochastic games played on the transition graph of classic one-counter automata. We study primarily the termination objective, where the goal of one player is to maximize the probability of reaching counter value 0, while the other player wishes to avoid this. Partly motivated by the goal of understanding termination objectives, we also study certain "limit" and "long run average" reward objectives that are closely related to some well-studied objectives for stochastic games with rewards. Examples of problems we address include: does player 1 have a strategy to ensure that the counter eventually hits 0, i.e., terminates, almost surely, regardless of what player 2 does? Or that the liminf (or limsup) counter value equals infinity with a desired probability? Or that the long run average reward is >0 with desired probability? We show that the qualitative termination problem for OC-SSGs is in NP intersection coNP, and is in P-time for 1-player OC-SSGs, or equivalently for one-counter Markov Decision Processes (OC-MDPs). Moreover, we show that quantitative limit problems for OC-SSGs are in NP intersection coNP, and are in P-time for 1-player OC-MDPs. Both qualitative limit problems and qualitative termination problems for OC-SSGs are already at least as hard as Condon's quantitative decision problem for finite-state SSGs.Comment: 20 pages, 1 figure. This is a full version of a paper accepted for publication in proceedings of FSTTCS 201

    Equilibria, Fixed Points, and Complexity Classes

    Get PDF
    Many models from a variety of areas involve the computation of an equilibrium or fixed point of some kind. Examples include Nash equilibria in games; market equilibria; computing optimal strategies and the values of competitive games (stochastic and other games); stable configurations of neural networks; analysing basic stochastic models for evolution like branching processes and for language like stochastic context-free grammars; and models that incorporate the basic primitives of probability and recursion like recursive Markov chains. It is not known whether these problems can be solved in polynomial time. There are certain common computational principles underlying different types of equilibria, which are captured by the complexity classes PLS, PPAD, and FIXP. Representative complete problems for these classes are respectively, pure Nash equilibria in games where they are guaranteed to exist, (mixed) Nash equilibria in 2-player normal form games, and (mixed) Nash equilibria in normal form games with 3 (or more) players. This paper reviews the underlying computational principles and the corresponding classes

    Polynomial Time Algorithms for Branching Markov Decision Processes and Probabilistic Min(Max) Polynomial Bellman Equations

    Get PDF
    We show that one can approximate the least fixed point solution for a multivariate system of monotone probabilistic max(min) polynomial equations, referred to as maxPPSs (and minPPSs, respectively), in time polynomial in both the encoding size of the system of equations and in log(1/epsilon), where epsilon > 0 is the desired additive error bound of the solution. (The model of computation is the standard Turing machine model.) We establish this result using a generalization of Newton's method which applies to maxPPSs and minPPSs, even though the underlying functions are only piecewise-differentiable. This generalizes our recent work which provided a P-time algorithm for purely probabilistic PPSs. These equations form the Bellman optimality equations for several important classes of infinite-state Markov Decision Processes (MDPs). Thus, as a corollary, we obtain the first polynomial time algorithms for computing to within arbitrary desired precision the optimal value vector for several classes of infinite-state MDPs which arise as extensions of classic, and heavily studied, purely stochastic processes. These include both the problem of maximizing and mininizing the termination (extinction) probability of multi-type branching MDPs, stochastic context-free MDPs, and 1-exit Recursive MDPs. Furthermore, we also show that we can compute in P-time an epsilon-optimal policy for both maximizing and minimizing branching, context-free, and 1-exit-Recursive MDPs, for any given desired epsilon > 0. This is despite the fact that actually computing optimal strategies is Sqrt-Sum-hard and PosSLP-hard in this setting. We also derive, as an easy consequence of these results, an FNP upper bound on the complexity of computing the value (within arbitrary desired precision) of branching simple stochastic games (BSSGs)

    PReMo : An Analyzer for P robabilistic Re cursive Mo dels

    Get PDF
    This paper describes PReMo, a tool for analyzing Recursive Markov Chains, and their controlled/game extensions: (1-exit) Recursive Markov Decision Processes and Recursive Simple Stochastic Games

    Approximating the Termination Value of One-Counter MDPs and Stochastic Games

    Get PDF
    One-counter MDPs (OC-MDPs) and one-counter simple stochastic games (OC-SSGs) are 1-player, and 2-player turn-based zero-sum, stochastic games played on the transition graph of classic one-counter automata (equivalently, pushdown automata with a 1-letter stack alphabet). A key objective for the analysis and verification of these games is the termination objective, where the players aim to maximize (minimize, respectively) the probability of hitting counter value 0, starting at a given control state and given counter value. Recently, we studied qualitative decision problems ("is the optimal termination value = 1?") for OC-MDPs (and OC-SSGs) and showed them to be decidable in P-time (in NP and coNP, respectively). However, quantitative decision and approximation problems ("is the optimal termination value ? p", or "approximate the termination value within epsilon") are far more challenging. This is so in part because optimal strategies may not exist, and because even when they do exist they can have a highly non-trivial structure. It thus remained open even whether any of these quantitative termination problems are computable. In this paper we show that all quantitative approximation problems for the termination value for OC-MDPs and OC-SSGs are computable. Specifically, given a OC-SSG, and given epsilon > 0, we can compute a value v that approximates the value of the OC-SSG termination game within additive error epsilon, and furthermore we can compute epsilon-optimal strategies for both players in the game. A key ingredient in our proofs is a subtle martingale, derived from solving certain LPs that we can associate with a maximizing OC-MDP. An application of Azuma's inequality on these martingales yields a computable bound for the "wealth" at which a "rich person's strategy" becomes epsilon-optimal for OC-MDPs.Comment: 35 pages, 1 figure, full version of a paper presented at ICALP 2011, invited for submission to Information and Computatio

    Qualitative Analysis of Concurrent Mean-payoff Games

    Get PDF
    We consider concurrent games played by two-players on a finite-state graph, where in every round the players simultaneously choose a move, and the current state along with the joint moves determine the successor state. We study a fundamental objective, namely, mean-payoff objective, where a reward is associated to each transition, and the goal of player 1 is to maximize the long-run average of the rewards, and the objective of player 2 is strictly the opposite. The path constraint for player 1 could be qualitative, i.e., the mean-payoff is the maximal reward, or arbitrarily close to it; or quantitative, i.e., a given threshold between the minimal and maximal reward. We consider the computation of the almost-sure (resp. positive) winning sets, where player 1 can ensure that the path constraint is satisfied with probability 1 (resp. positive probability). Our main results for qualitative path constraints are as follows: (1) we establish qualitative determinacy results that show that for every state either player 1 has a strategy to ensure almost-sure (resp. positive) winning against all player-2 strategies, or player 2 has a spoiling strategy to falsify almost-sure (resp. positive) winning against all player-1 strategies; (2) we present optimal strategy complexity results that precisely characterize the classes of strategies required for almost-sure and positive winning for both players; and (3) we present quadratic time algorithms to compute the almost-sure and the positive winning sets, matching the best known bound of algorithms for much simpler problems (such as reachability objectives). For quantitative constraints we show that a polynomial time solution for the almost-sure or the positive winning set would imply a solution to a long-standing open problem (the value problem for turn-based deterministic mean-payoff games) that is not known to be solvable in polynomial time

    Qualitative Reachability in Stochastic BPA Games

    Get PDF
    We consider a class of infinite-state stochastic games generated by stateless pushdown automata (or, equivalently, 1-exit recursive state machines), where the winning objective is specified by a regular set of target configurations and a qualitative probability constraint `>0' or `=1'. The goal of one player is to maximize the probability of reaching the target set so that the constraint is satisfied, while the other player aims at the opposite. We show that the winner in such games can be determined in PTIME for the `>0' constraint, and both in NP and coNP for the `=1' constraint. Further, we prove that the winning regions for both players are regular, and we design algorithms which compute the associated finite-state automata. Finally, we show that winning strategies can be synthesized effectively.Comment: Submitted to Information and Computation. 48 pages, 3 figure

    One-Counter Stochastic Games

    Get PDF
    We study the computational complexity of basic decision problems for one-counter simple stochastic games (OC-SSGs), under various objectives. OC-SSGs are 2-player turn-based stochastic games played on the transition graph of classic one-counter automata. We study primarily the termination objective, where the goal of one player is to maximize the probability of reaching counter value 0, while the other player wishes to avoid this. Partly motivated by the goal of understanding termination objectives, we also study certain ``limit\u27\u27 and ``long run average\u27\u27 reward objectives that are closely related to some well-studied objectives for stochastic games with rewards. Examples of problems we address include: does player 1 have a strategy to ensure that the counter eventually hits 0, i.e., terminates, almost surely, regardless of what player 2 does? Or that the liminfliminf (or limsuplimsup) counter value equals inftyinfty with a desired probability? Or that the long run average reward is >0>0 with desired probability? We show that the qualitative termination problem for OC-SSGs is in NPNP intersect coNPcoNP, and is in P-time for 1-player OC-SSGs, or equivalently for one-counter Markov Decision Processes (OC-MDPs). Moreover, we show that quantitative limit problems for OC-SSGs are in NPNP intersect coNPcoNP, and are in P-time for 1-player OC-MDPs. Both qualitative limit problems and qualitative termination problems for OC-SSGs are already at least as hard as Condon\u27s quantitative decision problem for finite-state SSGs
    • 

    corecore