Polynomial Time Algorithms for Branching Markov Decision Processes and Probabilistic Min(Max) Polynomial Bellman Equations
We show that one can approximate the least fixed point solution for a
multivariate system of monotone probabilistic max(min) polynomial equations,
referred to as maxPPSs (and minPPSs, respectively), in time polynomial in both
the encoding size of the system of equations and in log(1/epsilon), where
epsilon > 0 is the desired additive error bound of the solution. (The model of
computation is the standard Turing machine model.) We establish this result
using a generalization of Newton's method which applies to maxPPSs and minPPSs,
even though the underlying functions are only piecewise-differentiable. This
generalizes our recent work which provided a P-time algorithm for purely
probabilistic PPSs.
These equations form the Bellman optimality equations for several important
classes of infinite-state Markov Decision Processes (MDPs). Thus, as a
corollary, we obtain the first polynomial time algorithms for computing to
within arbitrary desired precision the optimal value vector for several classes
of infinite-state MDPs which arise as extensions of classic, and heavily
studied, purely stochastic processes. These include both the problem of
maximizing and minimizing the termination (extinction) probability of
multi-type branching MDPs, stochastic context-free MDPs, and 1-exit Recursive
MDPs.
Furthermore, we also show that we can compute in P-time an epsilon-optimal
policy for both maximizing and minimizing branching, context-free, and
1-exit Recursive MDPs, for any given desired epsilon > 0. This is despite the
fact that actually computing optimal strategies is Sqrt-Sum-hard and
PosSLP-hard in this setting.
We also derive, as an easy consequence of these results, an FNP upper bound
on the complexity of computing the value (within arbitrary desired precision)
of branching simple stochastic games (BSSGs).
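The kind of equation system this abstract describes can be made concrete with a small toy example. The sketch below (all equations and coefficients are invented for illustration) approximates the least fixed point of a two-variable maxPPS by plain Kleene/value iteration from the all-zeros vector. Note that this naive iteration converges only linearly; the point of the paper is a Newton-based method that runs in time polynomial in log(1/epsilon).

```python
# Toy maxPPS x = P(x): each polynomial is monotone with nonnegative
# coefficients summing to at most 1, so P maps [0,1]^2 into itself and
# has a least fixed point there.

def P(x):
    x1, x2 = x
    return (
        max(0.3 * x1 * x1 + 0.4,   # controller action a
            0.7 * x2 + 0.1),       # controller action b
        0.5 * x1 * x2 + 0.5,       # purely probabilistic variable
    )

def lfp(P, n, iters=10000):
    """Kleene iteration from 0: monotonically increases to the LFP."""
    x = (0.0,) * n
    for _ in range(iters):
        x = P(x)
    return x
```

For this particular system the LFP works out to (0.6, 5/7), with the max operator settling on its second branch.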
Greatest Fixed Points of Probabilistic Min/Max Polynomial Equations, and Reachability for Branching Markov Decision Processes
We give polynomial time algorithms for quantitative (and qualitative)
reachability analysis for Branching Markov Decision Processes (BMDPs).
Specifically, given a BMDP, and given an initial population, where the
objective of the controller is to maximize (or minimize) the probability of
eventually reaching a population that contains an object of a desired (or
undesired) type, we give algorithms for approximating the supremum (infimum)
reachability probability, within desired precision epsilon > 0, in time
polynomial in the encoding size of the BMDP and in log(1/epsilon). We
furthermore give P-time algorithms for computing epsilon-optimal strategies for
both maximization and minimization of reachability probabilities. We also give
P-time algorithms for all associated qualitative analysis problems, namely:
deciding whether the optimal (supremum or infimum) reachability probabilities
are 0 or 1. Prior to this paper, approximation of optimal reachability
probabilities for BMDPs was not even known to be decidable.
Our algorithms exploit the following basic fact: we show that for any BMDP,
its maximum (minimum) non-reachability probabilities are given by the greatest
fixed point (GFP) solution g* in [0,1]^n of a corresponding monotone max (min)
Probabilistic Polynomial System of equations (max/min-PPS), x=P(x), which are
the Bellman optimality equations for a BMDP with non-reachability objectives.
We show how to compute the GFP of max/min PPSs to desired precision in P-time.
We also study more general Branching Simple Stochastic Games (BSSGs) with
(non-)reachability objectives. We show that: (1) the value of these games is
captured by the GFP of a corresponding max-minPPS; (2) the quantitative problem
of approximating the value is in TFNP; and (3) the qualitative problems
associated with the value are all solvable in P-time.
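The GFP characterization above can likewise be illustrated with a toy min-PPS (equations and numbers invented for illustration). Because P is monotone and P(1) <= 1 componentwise, iterating from the all-ones vector, the top of [0,1]^n, yields a decreasing sequence converging to the greatest fixed point; the paper's actual P-time algorithm is considerably more involved than this sketch.

```python
# Toy min-PPS whose greatest fixed point in [0,1]^2 plays the role of the
# optimal non-reachability values described in the abstract.

def P(x):
    x1, x2 = x
    return (
        min(0.5 * x1 * x2 + 0.3,   # minimizing controller, action a
            0.8 * x2 + 0.1),       # minimizing controller, action b
        0.6 * x1 + 0.4,            # purely probabilistic variable
    )

def gfp(P, n, iters=10000):
    """Iterate from the top of the lattice; decreases to the GFP."""
    x = (1.0,) * n
    for _ in range(iters):
        x = P(x)
    return x
```

For this system the min operator settles on its first branch, and the GFP is approximately (0.45142, 0.67085).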
The complexity of analyzing infinite-state Markov chains, Markov decision processes, and stochastic games (Invited talk)
In recent years, a considerable amount of research has been devoted to understanding the computational complexity of basic analysis problems, and model checking problems, for finitely-presented countable infinite-state probabilistic systems. In particular, we have studied recursive Markov chains (RMCs), recursive Markov decision processes (RMDPs) and recursive stochastic games (RSGs). These arise by adding a natural recursion feature to finite-state Markov chains, MDPs, and stochastic games. RMCs and RMDPs provide natural abstract models of probabilistic procedural programs with recursion, and they are expressively equivalent to probabilistic and MDP extensions of pushdown automata. Moreover, a number of well-studied stochastic processes, including multi-type branching processes, (discrete-time) quasi-birth-death processes, and stochastic context-free grammars, can be suitably captured by subclasses of RMCs.
A central computational problem for analyzing various classes of recursive probabilistic systems is the computation of their (optimal) termination probabilities. These form a key ingredient for many other analyses, including model checking. For RMCs, and for important subclasses of RMDPs and RSGs, computing their termination values is equivalent to computing the least fixed point (LFP) solution of a corresponding monotone system of polynomial (min/max) equations. The complexity of computing the LFP solution for such equation systems is an intriguing problem, with connections to several areas of research. The LFP solution may in general be irrational. So, one possible aim is to compute it to within a desired additive error epsilon > 0. For general RMCs, approximating their termination probability within any non-trivial constant additive error < 1/2 is at least as hard as long-standing open problems in the complexity of numerical computation which are not even known to be in NP. For several key subclasses of RMCs and RMDPs, computing their termination values turns out to be much more tractable.
In this talk I will survey algorithms for, and discuss the computational complexity of, key analysis problems for classes of infinite-state recursive MCs, MDPs, and stochastic games. In particular, I will discuss recent joint work with Alistair Stewart and Mihalis Yannakakis (in papers that appeared at STOC'12 and ICALP'12), in which we have obtained polynomial time algorithms for computing, to within arbitrary desired precision, the LFP solution of probabilistic polynomial (min/max) systems of equations. Using this, we obtained the first P-time algorithms for computing (within desired precision) the extinction probabilities of multi-type branching processes, the probability that an arbitrary given stochastic context-free grammar generates a given string, and the optimum (maximum or minimum) extinction probabilities for branching MDPs and context-free MDPs. For branching MDPs, their corresponding equations amount to Bellman optimality equations for minimizing/maximizing their termination probabilities. Our algorithms combine variations and generalizations of Newton's method with other techniques, including linear programming. The algorithms are fairly easy to implement, but analyzing their worst-case running time is mathematically quite involved.
Polynomial Time Algorithms for Multi-Type Branching Processes and Stochastic Context-Free Grammars
We show that one can approximate the least fixed point solution for a
multivariate system of monotone probabilistic polynomial equations in time
polynomial in both the encoding size of the system of equations and in
log(1/epsilon), where epsilon > 0 is the desired additive error bound of the
solution. (The model of computation is the standard Turing machine model.)
We use this result to resolve several open problems regarding the
computational complexity of computing key quantities associated with some
classic and heavily studied stochastic processes, including multi-type
branching processes and stochastic context-free grammars.
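The simplest instance of the problem this abstract solves is the extinction probability of a one-type branching process, the least fixed point of x = f(x) for the offspring-generating polynomial f. The sketch below (the offspring distribution is a made-up example) contrasts plain iteration with a Newton step in the spirit of the methods the abstract builds on.

```python
# Extinction probability of a one-type branching process as the least
# fixed point of x = f(x).

def f(x):
    # 0 children w.p. 1/4, 1 child w.p. 1/4, 2 children w.p. 1/2
    return 0.25 + 0.25 * x + 0.5 * x * x

def fprime(x):
    return 0.25 + x

# Kleene iteration from 0: increases to the LFP, but only linearly.
x = 0.0
for _ in range(200):
    x = f(x)

# Newton iteration from 0: each step solves the linearization of
# f(y) - y = 0 and converges much faster from below.
y = 0.0
for _ in range(30):
    y += (f(y) - y) / (1.0 - fprime(y))

# For this f the fixed points are 1/2 and 1; the extinction
# probability is the least one, 1/2.
```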
Value Iteration for Simple Stochastic Games: Stopping Criterion and Learning Algorithm
Simple stochastic games can be solved by value iteration (VI), which yields a
sequence of under-approximations of the value of the game. This sequence is
guaranteed to converge to the value only in the limit. Since no stopping
criterion is known, this technique does not provide any guarantees on its
results. We provide the first stopping criterion for VI on simple stochastic
games. It is achieved by additionally computing a convergent sequence of
over-approximations of the value, relying on an analysis of the game graph.
Consequently, VI becomes an anytime algorithm returning the approximation of
the value and the current error bound. As another consequence, we can provide a
simulation-based asynchronous VI algorithm, which yields the same guarantees,
but without necessarily exploring the whole game graph. Comment: CAV201
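The anytime scheme described above can be sketched on a tiny reachability MDP (a toy stand-in for a simple stochastic game; the states and numbers below are invented). A lower sequence is iterated from 0 and an upper sequence from 1, stopping once they are epsilon-close. In general SSGs the naive upper sequence can converge to the wrong limit, which is precisely what the paper's analysis of the game graph repairs; this toy example has no such problematic structure.

```python
# Bounded value iteration with a stopping criterion.
# Value equations for the two non-trivial states:
#   v(s0) = max( v(p), 0 )              # maximizer: move to p, or give up
#   v(p)  = 0.5*1 + 0.3*v(s0) + 0.2*0   # chance: target / back / sink
def step(v):
    s0, p = v
    return (max(p, 0.0), 0.5 + 0.3 * s0)

def bounded_vi(eps=1e-6):
    lo, hi = (0.0, 0.0), (1.0, 1.0)      # under- and over-approximations
    while max(h - l for l, h in zip(lo, hi)) > eps:
        lo, hi = step(lo), step(hi)
    return lo, hi   # the true value vector lies between lo and hi
```

Stopping as soon as the two sequences meet up to eps is what turns plain VI into an anytime algorithm with an explicit error bound.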
Reachability for Branching Concurrent Stochastic Games
We give polynomial time algorithms for deciding almost-sure and limit-sure reachability in Branching Concurrent Stochastic Games (BCSGs). These are a class of infinite-state imperfect-information stochastic games that generalize both finite-state concurrent stochastic reachability games ([L. de Alfaro et al., 2007]) and branching simple stochastic reachability games ([K. Etessami et al., 2018]).
Reachability analysis of branching probabilistic processes
We study a fundamental class of infinite-state stochastic processes and stochastic
games, namely Branching Processes, under the properties of (single-target) reachability
and multi-objective reachability.
In particular, we study Branching Concurrent Stochastic Games (BCSGs), which
are an imperfect-information game extension of classical Branching Processes, and
show that these games are determined, i.e., have a value, under the fundamental objective
of reachability, building on and generalizing prior work on Branching Simple
Stochastic Games and finite-state Concurrent Stochastic Games. We show that, unlike
in the turn-based branching games, in the concurrent setting the almost-sure and limit-sure reachability problems do not coincide, and we give polynomial time algorithms
for deciding both almost-sure and limit-sure reachability. We also provide a discussion
on the complexity of quantitative reachability questions for BCSGs.
Furthermore, we introduce a new model, namely Ordered Branching Processes
(OBPs), which is a hybrid model between classical Branching Processes and Stochastic
Context-Free Grammars. Under the reachability objective, this model is equivalent
to the classical Branching Processes. We study qualitative multi-objective reachability
questions for Ordered Branching Markov Decision Processes (OBMDPs), or equivalently
context-free MDPs with simultaneous derivation. We provide algorithmic results
for efficiently checking certain Boolean combinations of qualitative reachability
and non-reachability queries with respect to different given target non-terminals.
Among the more interesting multi-objective reachability results, we provide two
separate algorithms for almost-sure and limit-sure multi-target reachability for OBMDPs.
Specifically, given an OBMDP, given a starting non-terminal, and given a set
of target non-terminals, our first algorithm decides whether the supremum probability,
of generating a tree that contains every target non-terminal in the set, is 1. Our second
algorithm decides whether there is a strategy for the player to almost-surely (with
probability 1) generate a tree that contains every target non-terminal in the set. Two separate algorithms are indeed needed: we show that, in this context, almost-sure and limit-sure multi-target reachability do not coincide. Both algorithms run in time
polynomial in the size of the OBMDP and exponential in the number of targets. Hence,
they run in polynomial time when the number of targets is fixed. The algorithms are
fixed-parameter tractable with respect to this number. Moreover, we show that the
qualitative almost-sure (and limit-sure) multi-target reachability decision problem is in general NP-hard when the size of the set of target non-terminals is not fixed.