52 research outputs found

    Exponential Lower Bounds for Solving Infinitary Payoff Games and Linear Programs

    Parity games form an intriguing family of infinitary payoff games whose solution is equivalent to the solution of important problems in automatic verification and automata theory. They also form a very natural subclass of mean and discounted payoff games, which in turn are very natural subclasses of turn-based stochastic payoff games. From a theoretical point of view, solving these games is one of the few problems that belong to the complexity class NP ∩ coNP, and even more interestingly, solving has been shown to belong to UP ∩ coUP, and also to PLS. It is a major open problem whether these game families can be solved in deterministic polynomial time. Policy iteration is one of the most important algorithmic schemes for solving infinitary payoff games. It is parameterized by an improvement rule that determines how to proceed in the iteration from one policy to the next. It is a major open problem whether there is an improvement rule that results in a polynomial time algorithm for solving one of the considered game classes. Linear programming is one of the most important computational problems studied by researchers in computer science, mathematics and operations research. Perhaps more articles and books are written about linear programming than on all other computational problems combined. The simplex and the dual-simplex algorithms are among the most widely used algorithms for solving linear programs in practice. Simplex algorithms for solving linear programs are closely related to policy iteration algorithms. Like policy iteration, the simplex algorithm is parameterized by a pivoting rule that describes how to proceed from one basic feasible solution in the linear program to the next. It is a major open problem whether there is a pivoting rule that results in a (strongly) polynomial time algorithm for solving linear programs.
    We contribute to both the policy iteration and the simplex algorithm by proving exponential lower bounds for several improvement and pivoting rules, respectively. For every considered improvement rule, we start by building 2-player parity games on which the respective policy iteration algorithm performs an exponential number of iterations. We then transform these 2-player games into 1-player Markov decision processes, which correspond almost immediately to concrete linear programs on which the respective simplex algorithm requires the same number of iterations. Additionally, we show how to transfer the lower bound results to more expressive game classes like payoff and turn-based stochastic games. In particular, we prove exponential lower bounds for the deterministic switch all and switch best improvement rules for solving games, for which no non-trivial lower bounds have been known since the introduction of Howard’s policy iteration algorithm in 1960. Moreover, we prove exponential lower bounds for the two most natural and most studied randomized pivoting rules suggested to date, namely the random facet and random edge rules for solving games and linear programs, for which no non-trivial lower bounds have been known for several decades. Furthermore, we prove an exponential lower bound for the switch half randomized improvement rule for solving games, which is considered to be the most important multi-switching randomized rule. Finally, we prove an exponential lower bound for the most natural and famous history-based pivoting rule due to Zadeh for solving games and linear programs, which has been an open problem for thirty years. Last but not least, we prove exponential lower bounds for two other classes of algorithms that solve parity games, namely for the model checking algorithm due to Stevens and Stirling and for the recursive algorithm by Zielonka.
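To make the notion of an improvement rule concrete, here is a minimal sketch of policy iteration with the switch-all (Howard) rule on a discounted Markov decision process. It is not taken from the thesis: the toy instance `TOY_MDP`, the discount factor 0.9, the iterative policy evaluation, and all function names are assumptions chosen for illustration.

```python
# Hypothetical toy MDP: state -> list of actions, action = (reward, {successor: probability}).
TOY_MDP = {
    "a": [(0.0, {"a": 1.0}), (1.0, {"b": 1.0})],
    "b": [(0.0, {"b": 1.0}), (2.0, {"a": 0.5, "b": 0.5})],
}


def q_value(action, values, gamma):
    """Immediate reward plus discounted expected value of the successors."""
    reward, successors = action
    return reward + gamma * sum(p * values[s] for s, p in successors.items())


def evaluate(mdp, policy, gamma, sweeps=1000):
    """Approximate the value of a fixed policy by repeated synchronous sweeps."""
    values = {s: 0.0 for s in mdp}
    for _ in range(sweeps):
        values = {s: q_value(mdp[s][policy[s]], values, gamma) for s in mdp}
    return values


def policy_iteration(mdp, gamma=0.9, eps=1e-9):
    """Switch-all rule: every state with an improving action switches to its best one."""
    policy = {s: 0 for s in mdp}  # assumed start: first action everywhere
    iterations = 0
    while True:
        values = evaluate(mdp, policy, gamma)
        new_policy = dict(policy)
        switched = False
        for s, actions in mdp.items():
            best = max(range(len(actions)),
                       key=lambda a: q_value(actions[a], values, gamma))
            if q_value(actions[best], values, gamma) > values[s] + eps:
                new_policy[s] = best
                switched = True
        if not switched:
            return policy, values, iterations
        policy = new_policy
        iterations += 1


policy, values, iterations = policy_iteration(TOY_MDP)
print(policy, iterations)
```

Other improvement rules differ only in which of the improving switches are applied in each round; the lower-bound constructions described in the abstract concern instances on which such rules need exponentially many rounds, which this toy instance does not attempt to reproduce.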

    The Worst-Case Complexity of Symmetric Strategy Improvement

    Symmetric strategy improvement is an algorithm introduced by Schewe et al. (ICALP 2015) that can be used to solve two-player games on directed graphs such as parity games and mean payoff games. In contrast to the usual well-known strategy improvement algorithm, it iterates over strategies of both players simultaneously. The symmetric version solves the known worst-case examples for strategy improvement quickly; however, its worst-case complexity remained open. We present a class of worst-case examples for symmetric strategy improvement on which this symmetric version also takes exponentially many steps. Remarkably, our examples exhibit this behaviour for any choice of improvement rule, which is in contrast to classical strategy improvement where hard instances are usually hand-crafted for a specific improvement rule. We present a generalized version of symmetric strategy iteration depending less rigidly on the interplay of the strategies of both players. However, it turns out that it has the same shortcomings.

    A Parity Game Tale of Two Counters

    Parity games are simple infinite games played on finite graphs with a winning condition that is expressive enough to capture nested least and greatest fixpoints. Through their tight relationship to the modal mu-calculus, they are used in practice for the model-checking and synthesis problems of the mu-calculus and related temporal logics like LTL and CTL. Solving parity games is a compelling complexity theoretic problem, as the problem lies in the intersection of UP and co-UP and is believed to admit a polynomial-time solution, motivating researchers to either find such a solution or to find superpolynomial lower bounds for existing algorithms to improve the understanding of parity games. We present a parameterized parity game called the Two Counters game, which provides an exponential lower bound for a wide range of attractor-based parity game solving algorithms. We are the first to provide an exponential lower bound to priority promotion with the delayed promotion policy, and the first to provide such a lower bound to tangle learning.
    Comment: In Proceedings GandALF 2019, arXiv:1909.0597

    Computer Science Logic 2018: CSL 2018, September 4-8, 2018, Birmingham, United Kingdom

    Computing Probabilistic Bisimilarity Distances via Policy Iteration

    A transformation mapping a labelled Markov chain to a simple stochastic game is presented. In the resulting simple stochastic game, each vertex corresponds to a pair of states of the labelled Markov chain. The value of a vertex of the simple stochastic game is shown to be equal to the probabilistic bisimilarity distance, a notion due to Desharnais, Gupta, Jagadeesan and Panangaden, of the corresponding pair of states of the labelled Markov chain. Bacci, Bacci, Larsen and Mardare introduced an algorithm to compute the probabilistic bisimilarity distances for a labelled Markov chain. A modification of a basic version of their algorithm for a labelled Markov chain is shown to be the policy iteration algorithm applied to the corresponding simple stochastic game. Furthermore, it is shown that this algorithm takes exponential time in the worst case.

    Discounted-Sum Automata with Multiple Discount Factors

    Discounting the influence of future events is a key paradigm in economics and it is widely used in computer-science models, such as games, Markov decision processes (MDPs), reinforcement learning, and automata. While a single game or MDP may allow for several different discount factors, discounted-sum automata (NDAs) were only studied with respect to a single discount factor. For every integer λ ∈ ℕ∖{0,1}, as opposed to every λ ∈ ℚ∖ℕ, the class of NDAs with discount factor λ (λ-NDAs) has good computational properties: it is closed under determinization and under the algebraic operations min, max, addition, and subtraction, and there are algorithms for its basic decision problems, such as automata equivalence and containment. We define and analyze discounted-sum automata in which each transition can have a different integral discount factor (integral NMDAs). We show that integral NMDAs with an arbitrary choice of discount factors are not closed under determinization and under algebraic operations. We then define and analyze a restricted class of integral NMDAs, which we call tidy NMDAs, in which the choice of discount factors depends on the prefix of the word read so far. Tidy NMDAs are as expressive as deterministic integral NMDAs with an arbitrary choice of discount factors, and some of their special cases are NMDAs in which the discount factor depends on the action (alphabet letter) or on the elapsed time. We show that for every function θ that defines the choice of discount factors, the class of θ-NMDAs enjoys all of the above good properties of integral NDAs, as well as the same complexities of the required decision problems. To this end, we also improve the previously known complexities of the decision problems of integral NDAs, and present tight bounds on the size blow-up involved in algebraic operations on them. All our results hold equally for automata on finite words and for automata on infinite words.
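As a small illustration of what a per-transition discount factor means, the sketch below computes the value of a finite run in which every transition carries its own weight and integral discount factor. The abstract does not spell out the value function; the convention assumed here is that the weight of the i-th transition is divided by the product of the discount factors of all earlier transitions. The function name `discounted_sum` and the sample runs are hypothetical.

```python
from fractions import Fraction


def discounted_sum(run):
    """Value of a finite run given as a list of (weight, discount_factor) pairs.

    Assumed convention: the i-th weight is scaled by 1 / (product of the
    discount factors of transitions 0..i-1).
    """
    total = Fraction(0)
    scale = Fraction(1)
    for weight, factor in run:
        total += scale * weight
        scale /= factor  # later weights are discounted by this transition's factor
    return total


# Mixed discount factors: 1 + 4/2 + 6/(2*3)
print(discounted_sum([(1, 2), (4, 3), (6, 2)]))
# Single discount factor 2, as in an ordinary discounted-sum automaton: 1 + 1/2 + 1/4
print(discounted_sum([(1, 2), (1, 2), (1, 2)]))
```

Exact rationals are used so that runs can be compared for equality without floating-point error.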

    A unified worst case for classical simplex and policy iteration pivot rules

    We construct a family of Markov decision processes for which the policy iteration algorithm needs an exponential number of improving switches with Dantzig's rule, with Bland's rule, and with the Largest Increase pivot rule. This immediately translates to a family of linear programs for which the simplex algorithm needs an exponential number of pivot steps with the same three pivot rules. Our results yield a unified construction that simultaneously reproduces well-known lower bounds for these classical pivot rules, and we are able to infer that any (deterministic or randomized) combination of them cannot avoid an exponential worst-case behavior. Regarding the policy iteration algorithm, pivot rules typically switch multiple edges simultaneously, and our lower bounds for Dantzig's rule and the Largest Increase rule, which perform only single switches, seem novel. Regarding the simplex algorithm, the individual lower bounds were previously obtained separately via deformed hypercube constructions. In contrast to previous bounds for the simplex algorithm via Markov decision processes, our rigorous analysis is reasonably concise.