105 research outputs found

    Set-based value operators for non-stationary Markovian environments

    This paper analyzes finite-state Markov Decision Processes (MDPs) with uncertain parameters in compact sets and re-examines results from robust MDPs via set-based fixed point theory. To this end, we generalize the Bellman and policy evaluation operators to contracting operators on the value function space and denote them as value operators. We lift these value operators to act on sets of value functions and denote them as set-based value operators. We prove that the set-based value operators are contractions in the space of compact value function sets. Leveraging insights from set theory, we generalize the rectangularity condition in the classic robust MDP literature to a containment condition for all value operators, which is weaker and can be applied to a larger set of parameter-uncertain MDPs and contracting operators in dynamic programming. We prove that the rectangularity condition and the containment condition are each sufficient to ensure that the set-based value operator's fixed point set contains its own extremal elements. For convex and compact sets of uncertain MDP parameters, we show equivalence between the classic robust value function and the supremum of the fixed point set of the set-based Bellman operator. Under dynamically changing MDP parameters in compact sets, we prove a set convergence result for value iteration, which otherwise may not converge to a single value function. Finally, we derive novel guarantees for probabilistic path-planning problems in planet exploration and stratospheric station-keeping. Comment: 17 pages, 11 figures, 1 table
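    The abstract's point about value iteration under changing parameters can be illustrated with a toy sketch. This is not the paper's construction: it assumes a cost-minimizing MDP, a finite set of two made-up parameter pairs (`PARAMS`, with hypothetical transition tables `P1`, `P2` and costs `c1`, `c2`), and brackets the iterates with a pointwise min and max over that set. Because each Bellman operator is monotone, the non-stationary iterate stays between the two bounding sequences even though it need not settle on a single value function.

```python
import random

def bellman(V, P, c, gamma):
    """One Bellman update for a cost-minimizing MDP.

    P[s][a] lists next-state probabilities; c[s][a] is the stage cost.
    """
    n = len(V)
    return [
        min(
            c[s][a] + gamma * sum(P[s][a][t] * V[t] for t in range(n))
            for a in range(len(c[s]))
        )
        for s in range(n)
    ]

def run(params, gamma=0.9, iters=200, seed=0):
    """Value iteration where the MDP parameters change at every step."""
    rng = random.Random(seed)
    n = len(params[0][1])
    V, lo, hi = [0.0] * n, [0.0] * n, [0.0] * n
    inside = True
    for _ in range(iters):
        # The environment picks one parameter pair for this step.
        P, c = rng.choice(params)
        V = bellman(V, P, c, gamma)
        # Bounding sequences: pointwise min / max over every parameter pair.
        cand_lo = [bellman(lo, Pk, ck, gamma) for Pk, ck in params]
        cand_hi = [bellman(hi, Pk, ck, gamma) for Pk, ck in params]
        lo = [min(col) for col in zip(*cand_lo)]
        hi = [max(col) for col in zip(*cand_hi)]
        inside = inside and all(
            l - 1e-9 <= v <= h + 1e-9 for l, v, h in zip(lo, V, hi)
        )
    return V, lo, hi, inside

# Hypothetical two-state, two-action parameter pairs (illustrative only).
P1 = [[[0.9, 0.1], [0.2, 0.8]], [[0.5, 0.5], [0.1, 0.9]]]
c1 = [[1.0, 2.0], [0.5, 1.5]]
P2 = [[[0.6, 0.4], [0.3, 0.7]], [[0.8, 0.2], [0.4, 0.6]]]
c2 = [[1.5, 1.0], [1.0, 0.5]]
PARAMS = [(P1, c1), (P2, c2)]
```

    Calling `run(PARAMS)` returns the final iterate, the two bounding value functions, and a flag recording whether the iterate stayed between the bounds at every step.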

    Reducing Collision Risk in Multi-Agent Path Planning: Application to Air Traffic Management

    To minimize collision risks in the multi-agent path planning problem with stochastic transition dynamics, we formulate a Markov decision process congestion game with a multi-linear congestion cost. Players within the game complete individual tasks while minimizing their own collision risks. We show that the set of Nash equilibria coincides with the first-order KKT points of a non-convex optimization problem. Our game is applied to a historical flight plan over France to reduce collision risks between commercial aircraft. Comment: 6 pages, 2 figures

    Blamelessly Optimal Control For Polytopic Safety Sets

    In many safety-critical optimal control problems, users may request multiple safety constraints that are jointly infeasible due to external factors such as subsystem failures, unexpected disturbances, or fuel limitations. In this manuscript, we introduce the concept of blameless optimality to characterize control actions that a) satisfy the highest prioritized and feasible safety constraints and b) remain optimal with respect to a mission objective. For a general optimal control problem with jointly infeasible safety constraints, we prove that a single optimization problem cannot find a blamelessly optimal controller. Instead, finding blamelessly optimal control actions requires sequentially solving at least two optimal control problems: one to determine the highest priority level of constraints that is feasible and another to determine the optimal control action with respect to these constraints. We apply our results to a helicopter emergency landing scenario in which violating at least one safety-induced landing constraint is unavoidable. Leveraging the concept of blameless optimality, we formulate blamelessly optimal controllers that can autonomously prioritize human safety over property integrity.
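    The two-stage procedure described in this abstract can be sketched on a toy problem. The sketch below is only an illustration under strong assumptions, not the paper's formulation: the control actions are a finite, non-empty list of candidates rather than the solution of an optimal control problem, the constraints are predicates ordered from highest to lowest priority, and a "priority level" is read as the number of top-priority constraints that can be met together. The name `blameless_choice` and the example data are hypothetical.

```python
def blameless_choice(candidates, constraints, cost):
    """Pick a candidate in two stages (toy, finite version).

    candidates:  non-empty list of possible actions
    constraints: predicates ordered from highest to lowest priority
    cost:        mission objective to minimize
    """
    # Stage 1: find how many top-priority constraints are jointly feasible.
    feasible = list(candidates)
    level = 0
    for check in constraints:
        narrowed = [u for u in feasible if check(u)]
        if not narrowed:
            break  # this constraint cannot be met with the ones above it
        feasible = narrowed
        level += 1
    # Stage 2: optimize the mission objective over the surviving candidates.
    return min(feasible, key=cost), level

# Hypothetical 1-D landing positions with two jointly infeasible constraints.
positions = [0, 1, 2, 3, 4, 5]
avoid_people = lambda u: u >= 3      # highest priority
avoid_buildings = lambda u: u <= 1   # lower priority
distance_to_target = lambda u: abs(u - 2)

choice, level = blameless_choice(
    positions, [avoid_people, avoid_buildings], distance_to_target
)
```

    Here only the highest-priority constraint can be satisfied, so the second stage minimizes the mission cost over positions 3 to 5 and ignores the infeasible lower-priority constraint.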