460 research outputs found
Scalable First-Order Methods for Robust MDPs
Robust Markov Decision Processes (MDPs) are a powerful framework for modeling
sequential decision-making problems with model uncertainty. This paper proposes
the first first-order framework for solving robust MDPs. Our algorithm
interleaves primal-dual first-order updates with approximate Value Iteration
updates. By carefully controlling the tradeoff between the accuracy and cost of
Value Iteration updates, we achieve an ergodic convergence rate of for the best
choice of parameters on ellipsoidal and Kullback-Leibler -rectangular
uncertainty sets, where and is the number of states and actions,
respectively. Our dependence on the number of states and actions is
significantly better (by a factor of ) than that of pure
Value Iteration algorithms. In numerical experiments on ellipsoidal uncertainty
sets we show that our algorithm is significantly more scalable than
state-of-the-art approaches. Our framework is also the first one to solve
robust MDPs with -rectangular KL uncertainty sets
Robust Markov Decision Processes
Markov decision processes (MDPs) are powerful tools for decision making in uncertain dynamic environments. However, the solutions of MDPs are of limited practical use due to their sensitivity to distributional model parameters, which are typically unknown and have to be estimated by the decision maker. To counter the detrimental effects of estimation errors, we consider robust MDPs that offer probabilistic guarantees in view of the unknown parameters. To this end, we assume that an observation history of the MDP is available. Based on this history, we derive a confidence region that contains the unknown parameters with a pre-specified probability 1-Ć. Afterwards, we determine a policy that attains the highest worst-case performance over this confidence region. By construction, this policy achieves or exceeds its worst-case performance with a confidence of at least 1 - Ć. Our method involves the solution of tractable conic programs of moderate size.
Probabilistic Bisimulations for PCTL Model Checking of Interval MDPs
Verification of PCTL properties of MDPs with convex uncertainties has been
investigated recently by Puggelli et al. However, model checking algorithms
typically suffer from state space explosion. In this paper, we address
probabilistic bisimulation to reduce the size of such an MDPs while preserving
PCTL properties it satisfies. We discuss different interpretations of
uncertainty in the models which are studied in the literature and that result
in two different definitions of bisimulations. We give algorithms to compute
the quotients of these bisimulations in time polynomial in the size of the
model and exponential in the uncertain branching. Finally, we show by a case
study that large models in practice can have small branching and that a
substantial state space reduction can be achieved by our approach.Comment: In Proceedings SynCoP 2014, arXiv:1403.784
Robust Control of Uncertain Markov Decision Processes with Temporal Logic Specifications
We present a method for designing robust controllers for dynamical systems with linear temporal logic specifications. We abstract the original system by a finite Markov Decision Process (MDP) that has transition probabilities in a specified uncertainty set. A robust control policy for the MDP is generated that maximizes the worst-case probability of satisfying the specification over all transition probabilities in the uncertainty set. To do this, we use a procedure from probabilistic model checking to combine the system model with an automaton representing the specification. This new MDP is then transformed into an equivalent form that satisfies assumptions for stochastic shortest path dynamic programming. A robust version of dynamic programming allows us to solve for a -suboptimal robust control policy with time complexity times that for the non-robust case. We then implement this control policy on the original dynamical system
- ā¦