Robust Control of Uncertain Markov Decision Processes with Temporal Logic Specifications
We present a method for designing robust controllers for dynamical systems with linear temporal logic specifications. We abstract the original system by a finite Markov Decision Process (MDP) that has transition probabilities in a specified uncertainty set. A robust control policy for the MDP is generated that maximizes the worst-case probability of satisfying the specification over all transition probabilities in the uncertainty set. To do this, we use a procedure from probabilistic model checking to combine the system model with an automaton representing the specification. This product MDP is then transformed into an equivalent form that satisfies the assumptions of stochastic shortest path dynamic programming. A robust version of dynamic programming allows us to solve for an ε-suboptimal robust control policy with time complexity O(log(1/ε)) times that for the non-robust case. We then implement this control policy on the original dynamical system.
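The inner step such a robust dynamic program repeats at every state is a worst-case expectation over the uncertainty set. A minimal sketch for the common special case of interval uncertainty sets (the paper's sets may be more general; the function name and the numbers below are purely illustrative):

```python
import numpy as np

def worst_case_expectation(V, p_low, p_high):
    """Minimize E_p[V] over distributions p with p_low <= p <= p_high
    and sum(p) == 1: greedily push the free probability mass toward the
    successors with the lowest value.
    Assumes the set is nonempty: sum(p_low) <= 1 <= sum(p_high)."""
    order = np.argsort(V)            # successors from lowest to highest value
    p = p_low.astype(float).copy()
    slack = 1.0 - p.sum()            # probability mass left to distribute
    for s in order:
        add = min(p_high[s] - p_low[s], slack)
        p[s] += add
        slack -= add
        if slack <= 0.0:
            break
    return float(p @ V)

# Example: three successors with values 0, 5, 10 and interval bounds.
V = np.array([0.0, 5.0, 10.0])
print(worst_case_expectation(V, np.array([0.1, 0.2, 0.1]),
                             np.array([0.6, 0.5, 0.4])))   # 2.5
```

The same greedy sort-and-fill solution is what keeps each robust Bellman backup cheap relative to the non-robust one.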
Multi-Objective Approaches to Markov Decision Processes with Uncertain Transition Parameters
Markov decision processes (MDPs) are a popular model for performance analysis and optimization of stochastic systems. The parameters describing the stochastic behavior of an MDP are typically estimated from empirical observations of a system, so their values are not known precisely. Different types of MDPs with uncertain, imprecise or bounded transition rates or probabilities and rewards exist in the literature. Commonly, the analysis of models with uncertainties amounts to searching for the most robust policy, meaning that the goal is to generate a policy with the greatest lower bound on performance (or, symmetrically, the lowest upper bound on costs). However, hedging against an unlikely worst case may lead to losses in other situations. In general, one is interested in policies that behave well in all situations, which results in a multi-objective view on decision making.
In this paper, we consider policies for the expected discounted reward measure of MDPs with uncertain parameters. In particular, the approach is defined for bounded-parameter MDPs (BMDPs) [8]. In this setting the worst-case, best-case and average-case performances of a policy are analyzed simultaneously, which yields a multi-scenario multi-objective optimization problem. The paper presents and evaluates approaches to compute the pure Pareto-optimal policies in the value vector space. Comment: 9 pages, 5 figures, preprint for VALUETOOLS 2017
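Once each candidate policy has been scored on the three scenarios (worst, average and best case), the Pareto-optimal set is obtained by a standard dominance filter over the value vectors. A minimal sketch, with hypothetical names and made-up numbers rather than the paper's evaluation:

```python
import numpy as np

def pareto_optimal(policies, scores):
    """Keep the policies whose (worst, average, best) value vectors are
    not dominated by any other policy's vector (higher is better)."""
    keep = []
    for i, s in enumerate(scores):
        dominated = any(np.all(t >= s) and np.any(t > s)
                        for j, t in enumerate(scores) if j != i)
        if not dominated:
            keep.append(policies[i])
    return keep

# Example: B dominates A; C trades worst-case value for best-case value.
scores = [np.array([1.0, 2.0, 3.0]),   # policy A
          np.array([1.5, 2.5, 3.0]),   # policy B
          np.array([0.5, 2.0, 9.0])]   # policy C
print(pareto_optimal(["A", "B", "C"], scores))   # ['B', 'C']
```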
Probabilistic Bisimulations for PCTL Model Checking of Interval MDPs
Verification of PCTL properties of MDPs with convex uncertainties has been investigated recently by Puggelli et al. However, model checking algorithms typically suffer from state space explosion. In this paper, we address probabilistic bisimulation to reduce the size of such MDPs while preserving the PCTL properties they satisfy. We discuss different interpretations of uncertainty in the models studied in the literature, which result in two different definitions of bisimulation. We give algorithms to compute the quotients of these bisimulations in time polynomial in the size of the model and exponential in the uncertain branching. Finally, we show by a case study that large models in practice can have small branching and that a substantial state space reduction can be achieved by our approach. Comment: In Proceedings SynCoP 2014, arXiv:1403.784
On stabilization of bilinear uncertain time-delay stochastic systems with Markovian jumping parameters
In this paper, we investigate the stochastic stabilization problem for a class of bilinear continuous time-delay uncertain systems with Markovian jumping parameters. Specifically, the stochastic bilinear jump system under study involves unknown state time-delay, parameter uncertainties, and unknown nonlinear deterministic disturbances. The jumping parameters considered here form a continuous-time discrete-state homogeneous Markov process. The whole system may be regarded as a stochastic bilinear hybrid system that includes both time-evolving and event-driven mechanisms. Our attention is focused on the design of a robust state-feedback controller such that, for all admissible uncertainties as well as nonlinear disturbances, the closed-loop system is stochastically exponentially stable in the mean square, independent of the time delay. Sufficient conditions are established to guarantee the existence of the desired robust controllers, which are given in terms of the solutions to a set of either linear matrix inequalities (LMIs) or coupled quadratic matrix inequalities. The developed theory is illustrated by numerical simulations.
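The flavor of such coupled LMI conditions shows up already in the much simpler delay-free, disturbance-free analysis problem for a Markovian jump linear system: mean-square stability holds iff coupled Lyapunov inequalities are feasible. A sketch with cvxpy and made-up two-mode system data (the paper's synthesis conditions additionally handle delay, bilinearity, uncertainty and disturbances):

```python
import numpy as np
import cvxpy as cp

# Hypothetical two-mode jump linear system dx = A(r_t) x dt
# (illustrative matrices, not taken from the paper).
A = [np.array([[-1.0, 0.5], [0.0, -2.0]]),
     np.array([[-0.5, 1.0], [0.2, -1.5]])]
Lam = np.array([[-0.6, 0.6],   # generator of the Markov chain (rows sum to 0)
                [0.4, -0.4]])

n, M, eps = 2, 2, 1e-6
P = [cp.Variable((n, n), symmetric=True) for _ in range(M)]
cons = []
for i in range(M):
    # Coupled Lyapunov inequality for mode i:
    # A_i' P_i + P_i A_i + sum_j Lam[i, j] P_j < 0,  P_i > 0
    coupled = A[i].T @ P[i] + P[i] @ A[i] + sum(Lam[i, j] * P[j] for j in range(M))
    cons += [P[i] >> eps * np.eye(n), coupled << -eps * np.eye(n)]

prob = cp.Problem(cp.Minimize(0), cons)   # pure feasibility problem
prob.solve(solver=cp.SCS)
print("mean-square stable (coupled LMIs feasible):", prob.status == cp.OPTIMAL)
```

Strict inequalities are approximated with an eps margin, the usual trick when checking Lyapunov-type LMIs numerically.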
Distributionally Robust Optimization for Sequential Decision Making
The distributionally robust Markov Decision Process (MDP) approach asks for a
distributionally robust policy that achieves the maximal expected total reward
under the most adversarial distribution of uncertain parameters. In this paper,
we study distributionally robust MDPs whose ambiguity sets for the uncertain parameters have a format that can easily incorporate both the uncertainty's generalized moments and statistical distance information in its description. In this way, we generalize existing works on distributionally robust MDPs with generalized-moment-based and statistical-distance-based ambiguity sets: information from the former class, such as moments and dispersions, is incorporated into the latter class, which critically depends on empirical observations of the uncertain parameters. We show that, under this format of ambiguity sets, the
resulting distributionally robust MDP remains tractable under mild technical
conditions. To be more specific, a distributionally robust policy can be
constructed by solving a sequence of one-stage convex optimization subproblems
…
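To make the one-stage subproblem concrete: each backup minimizes the expected continuation value over the ambiguity set, a convex program. A minimal sketch assuming a simple L1-ball ambiguity set around the empirical estimate (the paper's moment-plus-distance format is richer; robust_backup, p_hat and theta are illustrative names):

```python
import numpy as np
import cvxpy as cp

def robust_backup(p_hat, theta, r, V, gamma=0.95):
    """Worst-case one-stage value for a single state-action pair.
    Illustrative ambiguity set (not the paper's exact format):
    distributions within L1 distance theta of the estimate p_hat."""
    p = cp.Variable(len(p_hat), nonneg=True)
    cons = [cp.sum(p) == 1, cp.norm1(p - p_hat) <= theta]
    # The adversary picks the distribution minimizing our return.
    prob = cp.Problem(cp.Minimize(p @ (r + gamma * V)), cons)
    prob.solve()
    return prob.value

# One robust value-iteration sweep would take, per state, the max over
# actions of robust_backup(...).
p_hat = np.array([0.7, 0.2, 0.1])
print(robust_backup(p_hat, 0.2, np.ones(3), np.array([0.0, 1.0, 5.0])))
```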