A unified worst case for classical simplex and policy iteration pivot rules
We construct a family of Markov decision processes for which the policy
iteration algorithm needs an exponential number of improving switches with
Dantzig's rule, with Bland's rule, and with the Largest Increase pivot rule.
This immediately translates to a family of linear programs for which the
simplex algorithm needs an exponential number of pivot steps with the same
three pivot rules. Our results yield a unified construction that simultaneously
reproduces well-known lower bounds for these classical pivot rules, and we are
able to infer that any (deterministic or randomized) combination of them cannot
avoid exponential worst-case behavior. Regarding the policy iteration
algorithm, pivot rules typically switch multiple edges simultaneously, so our
lower bounds for Dantzig's rule and the Largest Increase rule, which perform
only single switches, appear to be novel. Regarding the simplex algorithm, the
individual lower bounds were previously obtained separately via deformed
hypercube constructions. In contrast to previous bounds for the simplex
algorithm via Markov decision processes, our rigorous analysis is reasonably
concise.
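The policy iteration scheme the abstract refers to can be sketched on a toy discounted MDP. The instance, the discount factor, and the greedy single-switch rule below are illustrative assumptions of ours, not the paper's exponential lower-bound construction:

```python
import numpy as np

# Toy discounted MDP (illustrative only):
# P[s][a] = list of (next_state, probability), R[s][a] = immediate reward.
GAMMA = 0.9
P = {
    0: {0: [(0, 1.0)], 1: [(1, 1.0)]},
    1: {0: [(0, 1.0)], 1: [(2, 1.0)]},
    2: {0: [(2, 1.0)], 1: [(2, 1.0)]},
}
R = {
    0: {0: 0.0, 1: 1.0},
    1: {0: 0.0, 1: 2.0},
    2: {0: 5.0, 1: 5.0},
}
N = len(P)

def evaluate(policy):
    """Value of a fixed policy: solve the linear system V = R_pi + GAMMA * P_pi V."""
    A, b = np.eye(N), np.zeros(N)
    for s in range(N):
        a = policy[s]
        b[s] = R[s][a]
        for s2, p in P[s][a]:
            A[s, s2] -= GAMMA * p
    return np.linalg.solve(A, b)

def policy_iteration(policy):
    """Single-switch policy iteration: in each round, apply the one improving
    switch with the largest appraisal (a Dantzig-like greedy rule)."""
    while True:
        V = evaluate(policy)
        best_gain, best_switch = 1e-9, None
        for s in range(N):
            for a in P[s]:
                q = R[s][a] + GAMMA * sum(p * V[s2] for s2, p in P[s][a])
                if q - V[s] > best_gain:
                    best_gain, best_switch = q - V[s], (s, a)
        if best_switch is None:
            return policy, V          # no improving switch left: policy is optimal
        policy[best_switch[0]] = best_switch[1]

policy, V = policy_iteration({0: 0, 1: 0, 2: 0})
print(policy)  # {0: 1, 1: 1, 2: 0}: both transient states head for the reward state
```

On this easy instance the rule converges in a few switches; the point of the paper is a crafted family on which such rules need exponentially many.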
Exponential Lower Bounds for Solving Infinitary Payoff Games and Linear Programs
Parity games form an intriguing family of infinitary payoff games whose solution
is equivalent to the solution of important problems in automatic verification and
automata theory. They also form a very natural subclass of mean and discounted
payoff games, which in turn are very natural subclasses of turn-based stochastic
payoff games. From a theoretical point of view, solving these games is one of the few
problems that belong to the complexity class NP ∩ coNP; even more interestingly,
their solution has been shown to belong to UP ∩ coUP, and also to PLS. It is a major open
problem whether these game families can be solved in deterministic polynomial
time.
Policy iteration is one of the most important algorithmic schemes for solving
infinitary payoff games. It is parameterized by an improvement rule that determines
how to proceed in the iteration from one policy to the next. It is a major open problem
whether there is an improvement rule that results in a polynomial time algorithm for
solving one of the considered game classes.
Linear programming is one of the most important computational problems studied
by researchers in computer science, mathematics and operations research. Perhaps
more articles and books have been written about linear programming than about all other
computational problems combined.
The simplex and the dual-simplex algorithms are among the most widely used
algorithms for solving linear programs in practice. Simplex algorithms for solving
linear programs are closely related to policy iteration algorithms. Like policy iteration,
the simplex algorithm is parameterized by a pivoting rule that describes how
to proceed from one basic feasible solution in the linear program to the next. It is
a major open problem whether there is a pivoting rule that results in a (strongly)
polynomial time algorithm for solving linear programs.
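A pivoting rule can be modelled as the parameter that selects the entering variable from the current reduced costs. A minimal sketch contrasting two classical rules (the function names are ours):

```python
def dantzig_rule(reduced_costs):
    """Dantzig: enter the variable with the most negative reduced cost."""
    j = min(range(len(reduced_costs)), key=lambda i: reduced_costs[i])
    return j if reduced_costs[j] < 0 else None

def bland_rule(reduced_costs):
    """Bland: enter the lowest-index improving variable (prevents cycling)."""
    for j, c in enumerate(reduced_costs):
        if c < 0:
            return j
    return None

costs = [-0.5, -3.0, 1.0]
print(dantzig_rule(costs), bland_rule(costs))  # 1 0
print(dantzig_rule([0.0, 2.0]))                # None (optimal)
```

The simplex loop itself stays fixed; only this selection function changes, which is why lower bounds must be proved rule by rule.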
We contribute to both the policy iteration and the simplex algorithm by proving
exponential lower bounds for several improvement and pivoting rules, respectively. For every
considered improvement rule, we start by building 2-player parity games on which
the respective policy iteration algorithm performs an exponential number of iterations.
We then transform these 2-player games into 1-player Markov decision processes
which correspond almost immediately to concrete linear programs on which the
respective simplex algorithm requires the same number of iterations. Additionally,
we show how to transfer the lower bound results to more expressive game classes
like payoff and turn-based stochastic games.
In particular, we prove exponential lower bounds for the deterministic switch-all
and switch-best improvement rules for solving games, for which no non-trivial
lower bounds have been known since the introduction of Howard’s policy iteration
algorithm in 1960. Moreover, we prove exponential lower bounds for the two most
natural and most studied randomized pivoting rules suggested to date, namely the random
facet and random edge rules for solving games and linear programs, for which
no non-trivial lower bounds have been known for several decades. Furthermore, we
prove an exponential lower bound for the switch-half randomized improvement rule
for solving games, which is considered to be the most important multi-switching
randomized rule. Finally, we prove an exponential lower bound for the most natural
and famous history-based pivoting rule due to Zadeh for solving games and linear
programs, which has been an open problem for thirty years.
Last but not least, we prove exponential lower bounds for two other classes of
algorithms that solve parity games, namely for the model checking algorithm due to
Stevens and Stirling and for the recursive algorithm by Zielonka.
Algorithms and Conditional Lower Bounds for Planning Problems
We consider planning problems for graphs, Markov decision processes (MDPs),
and games on graphs. While graphs represent the most basic planning model, MDPs
represent interaction with nature and games on graphs represent interaction
with an adversarial environment. We consider two planning problems where there
are k different target sets, and the problems are as follows: (a) the coverage
problem asks whether there is a plan for each individual target set, and (b)
the sequential target reachability problem asks whether the targets can be
reached in sequence. For the coverage problem, we present a linear-time
algorithm for graphs and quadratic conditional lower bound for MDPs and games
on graphs. For the sequential target problem, we present a linear-time
algorithm for graphs, a sub-quadratic algorithm for MDPs, and a quadratic
conditional lower bound for games on graphs. Our results with conditional lower
bounds establish (i) model-separation results showing that for the coverage
problem MDPs and games on graphs are harder than graphs and for the sequential
reachability problem games on graphs are harder than MDPs and graphs; (ii)
objective-separation results showing that for MDPs the coverage problem is
harder than the sequential target problem.
Comment: Accepted at ICAPS'1
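For plain graphs, the linear-time coverage algorithm amounts to a single reachability pass: mark everything reachable from the start, then test each target set against the marks. A hedged sketch (the graph and target sets are illustrative):

```python
from collections import deque

def coverage(n, adj, start, targets):
    """Graph coverage: one BFS from `start`, then target set T_i is covered
    iff it contains a reachable vertex. O(|V| + |E| + total target size)."""
    reachable = [False] * n
    reachable[start] = True
    q = deque([start])
    while q:
        u = q.popleft()
        for v in adj.get(u, []):
            if not reachable[v]:
                reachable[v] = True
                q.append(v)
    return [any(reachable[v] for v in T) for T in targets]

adj = {0: [1, 2], 1: [3], 4: [0]}               # 4 -> 0, but 0 cannot reach 4
print(coverage(5, adj, 0, [{1, 4}, {3}, {4}]))  # [True, True, False]
```

The conditional lower bounds in the abstract say that for MDPs and games on graphs no comparably fast (sub-quadratic) coverage algorithm is expected.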
The Complexity of the Simplex Method
The simplex method is a well-studied and widely-used pivoting method for
solving linear programs. When Dantzig originally formulated the simplex method,
he gave a natural pivot rule that pivots into the basis a variable with the
most violated reduced cost. In their seminal work, Klee and Minty showed that
this pivot rule takes exponential time in the worst case. We prove two main
results on the simplex method. Firstly, we show that it is PSPACE-complete to
find the solution that is computed by the simplex method using Dantzig's pivot
rule. Secondly, we prove that deciding whether Dantzig's rule ever chooses a
specific variable to enter the basis is PSPACE-complete. We use the known
connection between Markov decision processes (MDPs) and linear programming, and
an equivalence between Dantzig's pivot rule and a natural variant of policy
iteration for average-reward MDPs. We construct MDPs and show
PSPACE-completeness results for single-switch policy iteration, which in turn
imply our main results for the simplex method.
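The Klee-Minty behavior of Dantzig's largest-coefficient rule can be reproduced with a small dense-tableau simplex. The implementation below is our own illustrative sketch, not the paper's construction; on the standard Klee-Minty cube in dimension n the rule visits all 2^n vertices, i.e. takes 2^n - 1 pivots:

```python
import numpy as np

def dantzig_simplex(c, A, b):
    """Maximize c^T x s.t. A x <= b, x >= 0, starting from the slack basis.
    Returns (optimal value, number of pivots)."""
    m, n = A.shape
    T = np.zeros((m + 1, n + m + 1))     # objective row on top: z - c^T x = 0
    T[0, :n] = -c
    T[1:, :n] = A
    T[1:, n:n + m] = np.eye(m)
    T[1:, -1] = b
    pivots = 0
    while True:
        j = int(np.argmin(T[0, :-1]))    # Dantzig: most negative reduced cost
        if T[0, j] >= -1e-9:
            return T[0, -1], pivots      # all reduced costs nonnegative: optimal
        col = T[1:, j]
        safe = np.where(col > 1e-9, col, 1.0)
        ratios = np.where(col > 1e-9, T[1:, -1] / safe, np.inf)
        if not np.isfinite(ratios.min()):
            raise ValueError("LP is unbounded")
        r = int(np.argmin(ratios)) + 1   # ratio test picks the leaving row
        T[r] /= T[r, j]
        for i in range(m + 1):
            if i != r:
                T[i] -= T[i, j] * T[r]
        pivots += 1

# Klee-Minty cube: max sum_j 2^(n-j) x_j  s.t.  sum_{j<i} 2^(i-j+1) x_j + x_i <= 5^i.
n = 3
c = np.array([2.0 ** (n - 1 - i) for i in range(n)])
A = np.array([[2.0 ** (i - j + 1) if j < i else (1.0 if j == i else 0.0)
               for j in range(n)] for i in range(n)])
b = np.array([5.0 ** (i + 1) for i in range(n)])
opt, pivots = dantzig_simplex(c, A, b)
print(opt, pivots)  # 125.0 7 (2^3 - 1 pivots; optimum 5^3)
```

The PSPACE-completeness results above concern predicting this deterministic trajectory, not merely its length.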
Symmetric Strategy Improvement
Symmetry is inherent in the definition of most two-player zero-sum
games, including parity, mean-payoff, and discounted-payoff games. It is
therefore quite surprising that no symmetric analysis techniques for these
games exist. We develop a novel symmetric strategy improvement algorithm where,
in each iteration, the strategies of both players are improved simultaneously.
We show that symmetric strategy improvement defies Friedmann's traps, which
shook the belief in the potential of classic strategy improvement to be
polynomial.