
    A unified worst case for classical simplex and policy iteration pivot rules

    We construct a family of Markov decision processes for which the policy iteration algorithm needs an exponential number of improving switches with Dantzig's rule, with Bland's rule, and with the Largest Increase pivot rule. This immediately translates to a family of linear programs for which the simplex algorithm needs an exponential number of pivot steps with the same three pivot rules. Our results yield a unified construction that simultaneously reproduces well-known lower bounds for these classical pivot rules, and we are able to infer that any (deterministic or randomized) combination of them cannot avoid an exponential worst-case behavior. Regarding the policy iteration algorithm, pivot rules typically switch multiple edges simultaneously, and our lower bounds for Dantzig's rule and the Largest Increase rule, which perform only single switches, seem novel. Regarding the simplex algorithm, the individual lower bounds were previously obtained separately via deformed hypercube constructions. In contrast to previous bounds for the simplex algorithm via Markov decision processes, our rigorous analysis is reasonably concise.
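
    To make "improving switches" concrete, the following is a minimal sketch of single-switch policy iteration with a Dantzig-style rule (switch the state-action pair with the largest advantage). It is not the construction from the abstract: the three-state MDP, the discounted criterion, the discount factor `GAMMA`, and all function names are assumptions chosen purely for illustration.

```python
# Hypothetical toy MDP (not from the paper): ACTIONS[s] is a list of
# (reward, {next_state: probability}) pairs, one per available action.
GAMMA = 0.9  # assumed discount factor

ACTIONS = {
    0: [(0.0, {0: 1.0}), (1.0, {1: 1.0})],
    1: [(0.0, {0: 1.0}), (2.0, {2: 1.0})],
    2: [(0.0, {2: 1.0}), (5.0, {0: 1.0})],
}


def q_value(s, a, v):
    """Discounted value of playing action a in state s, then following v."""
    reward, trans = ACTIONS[s][a]
    return reward + GAMMA * sum(p * v[t] for t, p in trans.items())


def evaluate(policy, sweeps=2000):
    """Approximate the value of a fixed policy by repeated backups."""
    v = {s: 0.0 for s in ACTIONS}
    for _ in range(sweeps):
        v = {s: q_value(s, policy[s], v) for s in ACTIONS}
    return v


def single_switch_policy_iteration(policy, tol=1e-9):
    """Perform one improving switch per iteration, chosen by largest advantage."""
    policy = dict(policy)
    switches = 0
    while True:
        v = evaluate(policy)
        # Advantage of (s, a) over the current policy; this plays the role
        # of the reduced cost in the corresponding linear program.
        gain, s, a = max(
            (q_value(s, a, v) - v[s], s, a)
            for s in ACTIONS
            for a in range(len(ACTIONS[s]))
        )
        if gain <= tol:  # no improving switch left: policy is optimal
            return policy, v, switches
        policy[s] = a
        switches += 1


final_policy, values, num_switches = single_switch_policy_iteration(
    {0: 0, 1: 0, 2: 0}
)
print(final_policy, num_switches)
```

    On this toy instance only a handful of switches are needed; the lower-bound families described above are built so that the analogous count grows exponentially in the instance size.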

    Exponential Lower Bounds for Solving Infinitary Payoff Games and Linear Programs

    Parity games form an intriguing family of infinitary payoff games whose solution is equivalent to the solution of important problems in automatic verification and automata theory. They also form a very natural subclass of mean and discounted payoff games, which in turn are very natural subclasses of turn-based stochastic payoff games. From a theoretical point of view, solving these games is one of the few problems that belong to the complexity class NP ∩ coNP, and even more interestingly, solving has been shown to belong to UP ∩ coUP, and also to PLS. It is a major open problem whether these game families can be solved in deterministic polynomial time. Policy iteration is one of the most important algorithmic schemes for solving infinitary payoff games. It is parameterized by an improvement rule that determines how to proceed in the iteration from one policy to the next. It is a major open problem whether there is an improvement rule that results in a polynomial time algorithm for solving one of the considered game classes. Linear programming is one of the most important computational problems studied by researchers in computer science, mathematics and operations research. Perhaps more articles and books are written about linear programming than on all other computational problems combined. The simplex and the dual-simplex algorithms are among the most widely used algorithms for solving linear programs in practice. Simplex algorithms for solving linear programs are closely related to policy iteration algorithms. Like policy iteration, the simplex algorithm is parameterized by a pivoting rule that describes how to proceed from one basic feasible solution in the linear program to the next. It is a major open problem whether there is a pivoting rule that results in a (strongly) polynomial time algorithm for solving linear programs.
We contribute to both the policy iteration and the simplex algorithm by proving exponential lower bounds for several improvement and pivoting rules, respectively. For every considered improvement rule, we start by building 2-player parity games on which the respective policy iteration algorithm performs an exponential number of iterations. We then transform these 2-player games into 1-player Markov decision processes, which correspond almost immediately to concrete linear programs on which the respective simplex algorithm requires the same number of iterations. Additionally, we show how to transfer the lower bound results to more expressive game classes like payoff and turn-based stochastic games. In particular, we prove exponential lower bounds for the deterministic switch all and switch best improvement rules for solving games, for which no non-trivial lower bounds have been known since the introduction of Howard’s policy iteration algorithm in 1960. Moreover, we prove exponential lower bounds for the two most natural and most studied randomized pivoting rules suggested to date, namely the random facet and random edge rules for solving games and linear programs, for which no non-trivial lower bounds have been known for several decades. Furthermore, we prove an exponential lower bound for the switch half randomized improvement rule for solving games, which is considered to be the most important multi-switching randomized rule. Finally, we prove an exponential lower bound for the most natural and famous history-based pivoting rule due to Zadeh for solving games and linear programs, which has been an open problem for thirty years. Last but not least, we prove exponential lower bounds for two other classes of algorithms that solve parity games, namely for the model checking algorithm due to Stevens and Stirling and for the recursive algorithm by Zielonka.

    Algorithms and Conditional Lower Bounds for Planning Problems

    We consider planning problems for graphs, Markov decision processes (MDPs), and games on graphs. While graphs represent the most basic planning model, MDPs represent interaction with nature and games on graphs represent interaction with an adversarial environment. We consider two planning problems where there are k different target sets, and the problems are as follows: (a) the coverage problem asks whether there is a plan for each individual target set, and (b) the sequential target reachability problem asks whether the targets can be reached in sequence. For the coverage problem, we present a linear-time algorithm for graphs and a quadratic conditional lower bound for MDPs and games on graphs. For the sequential target problem, we present a linear-time algorithm for graphs, a sub-quadratic algorithm for MDPs, and a quadratic conditional lower bound for games on graphs. Our results with conditional lower bounds establish (i) model-separation results showing that for the coverage problem MDPs and games on graphs are harder than graphs and for the sequential reachability problem games on graphs are harder than MDPs and graphs; (ii) objective-separation results showing that for MDPs the coverage problem is harder than the sequential target problem. Comment: Accepted at ICAPS'1

    The Complexity of the Simplex Method

    The simplex method is a well-studied and widely used pivoting method for solving linear programs. When Dantzig originally formulated the simplex method, he gave a natural pivot rule that pivots into the basis a variable with the most violated reduced cost. In their seminal work, Klee and Minty showed that this pivot rule takes exponential time in the worst case. We prove two main results on the simplex method. Firstly, we show that it is PSPACE-complete to find the solution that is computed by the simplex method using Dantzig's pivot rule. Secondly, we prove that deciding whether Dantzig's rule ever chooses a specific variable to enter the basis is PSPACE-complete. We use the known connection between Markov decision processes (MDPs) and linear programming, and an equivalence between Dantzig's pivot rule and a natural variant of policy iteration for average-reward MDPs. We construct MDPs and show PSPACE-completeness results for single-switch policy iteration, which in turn imply our main results for the simplex method.

    Symmetric Strategy Improvement

    Symmetry is inherent in the definition of most two-player zero-sum games, including parity, mean-payoff, and discounted-payoff games. It is therefore quite surprising that no symmetric analysis techniques for these games exist. We develop a novel symmetric strategy improvement algorithm where, in each iteration, the strategies of both players are improved simultaneously. We show that symmetric strategy improvement defies Friedmann's traps, which shook the belief in the potential of classic strategy improvement to be polynomial.