Set-based value operators for non-stationary Markovian environments
This paper analyzes finite state Markov Decision Processes (MDPs) with
uncertain parameters in compact sets and re-examines results from robust MDPs
via set-based fixed point theory. To this end, we generalize the Bellman and
policy evaluation operators to contracting operators on the value function
space and denote them as value operators. We lift these value operators
to act on sets of value functions and denote them as set-based
value operators. We prove that the set-based value operators are
contractions in the space of compact value function sets. Leveraging
insights from set theory, we generalize the rectangularity condition in the classic
robust MDP literature to a containment condition for all value operators, which
is weaker and can be applied to a larger set of parameter-uncertain MDPs and
contracting operators in dynamic programming. We prove that the
rectangularity condition and the containment condition are each sufficient to ensure that
the set-based value operator's fixed point set contains its own extremal
elements. For convex and compact sets of uncertain MDP parameters, we show
equivalence between the classic robust value function and the supremum of the
fixed point set of the set-based Bellman operator. Under dynamically changing
MDP parameters in compact sets, we prove a set convergence result for value
iteration, which otherwise may not converge to a single value function.
Finally, we derive novel guarantees for probabilistic path-planning problems in
planet exploration and stratospheric station-keeping.
Comment: 17 pages, 11 figures, 1 table
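The set convergence claim for value iteration under changing parameters can be illustrated with a minimal sketch. It is not the paper's construction: the two-state, two-action MDP, the two-element parameter set `PARAMS`, the discount `GAMMA`, and the helper names are all hypothetical, and the elementwise min/max operators used here only give interval bounds on the iterates.

```python
import random

GAMMA = 0.9  # assumed discount factor

# Hypothetical finite parameter set: r[s][a] are rewards, P[s][a][s_next] are transitions.
PARAMS = [
    {"r": [[1.0, 0.0], [0.0, 2.0]],
     "P": [[[0.9, 0.1], [0.2, 0.8]], [[0.5, 0.5], [0.1, 0.9]]]},
    {"r": [[0.5, 0.2], [0.3, 1.5]],
     "P": [[[0.7, 0.3], [0.4, 0.6]], [[0.6, 0.4], [0.3, 0.7]]]},
]


def bellman(theta, v):
    """Bellman operator for one parameter: best reward plus discounted expected value."""
    out = []
    for s in range(len(v)):
        q = [
            theta["r"][s][a]
            + GAMMA * sum(p * v[t] for t, p in enumerate(theta["P"][s][a]))
            for a in range(len(theta["r"][s]))
        ]
        out.append(max(q))
    return out


def lower_op(v):
    """Elementwise minimum of the Bellman images over all parameters."""
    return [min(col) for col in zip(*(bellman(th, v) for th in PARAMS))]


def upper_op(v):
    """Elementwise maximum of the Bellman images over all parameters."""
    return [max(col) for col in zip(*(bellman(th, v) for th in PARAMS))]


def run(n_iters=200, seed=0):
    """Value iteration with a randomly switching parameter, plus lower/upper bound iterates."""
    rng = random.Random(seed)
    v = lo = hi = [0.0, 0.0]
    for _ in range(n_iters):
        v = bellman(rng.choice(PARAMS), v)  # parameter changes at every step
        lo = lower_op(lo)
        hi = upper_op(hi)
    return lo, v, hi


if __name__ == "__main__":
    lo, v, hi = run()
    print("lower bound:", lo)
    print("iterate:    ", v)
    print("upper bound:", hi)
```

Because each operator is monotone, the switching iterate stays between the two bound sequences when all three start from the same value function, even though the iterate itself need not settle on a single value function.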
Reducing Collision Risk in Multi-Agent Path Planning: Application to Air Traffic Management
To minimize collision risks in the multi-agent path planning problem with
stochastic transition dynamics, we formulate a Markov decision process
congestion game with a multi-linear congestion cost. Players within the game
complete individual tasks while minimizing their own collision risks. We show
that the set of Nash equilibria coincides with the first-order KKT points of a
non-convex optimization problem. Our game is applied to a historical flight
plan over France to reduce collision risks between commercial aircraft.
Comment: 6 pages, 2 figures
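The link between Nash equilibria and first-order KKT points can be sketched on a much simpler game than the one in the abstract. The example below is an assumption-laden toy, not the paper's MDP congestion game: two players each pick a mixed strategy over the same routes, the collision cost is the bilinear term sum_k x1[k] * x2[k], and `c1`, `c2` are hypothetical per-route base costs. The potential sum_k x1[k] * x2[k] + c1·x1 + c2·x2 is linear in each player's own strategy, so its first-order conditions on the product of simplices reduce to a best-response check.

```python
def player_cost(own, other, base):
    """Expected cost of one player: collision probability plus own route cost."""
    return sum(o * (p + b) for o, p, b in zip(own, other, base))


def is_nash(x1, x2, c1, c2, tol=1e-9):
    """True if no pure-strategy deviation lowers either player's expected cost."""
    for own, other, base in ((x1, x2, c1), (x2, x1, c2)):
        current = player_cost(own, other, base)
        for k in range(len(own)):
            pure = [1.0 if j == k else 0.0 for j in range(len(own))]
            if player_cost(pure, other, base) < current - tol:
                return False
    return True


def is_kkt_point(x1, x2, c1, c2, tol=1e-9):
    """First-order condition of the potential on the product of simplices:
    every route used with positive probability has a minimal partial derivative."""
    for own, other, base in ((x1, x2, c1), (x2, x1, c2)):
        grad = [p + b for p, b in zip(other, base)]  # d(potential)/d(own[k])
        g_min = min(grad)
        if any(o > tol and g > g_min + tol for o, g in zip(own, grad)):
            return False
    return True


if __name__ == "__main__":
    zero = [0.0, 0.0]
    print(is_nash([1.0, 0.0], [0.0, 1.0], zero, zero))       # players on separate routes
    print(is_kkt_point([1.0, 0.0], [1.0, 0.0], zero, zero))  # both on the same route
```

On this toy game the two checks agree at every strategy pair, which mirrors the coincidence stated in the abstract without reproducing its proof or its stochastic dynamics.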
Blamelessly Optimal Control For Polytopic Safety Sets
In many safety-critical optimal control problems, users may request multiple
safety constraints that are jointly infeasible due to external factors such as
subsystem failures, unexpected disturbances, or fuel limitations. In this
manuscript, we introduce the concept of blameless optimality to characterize
control actions that a) satisfy the highest-priority feasible safety
constraints and b) remain optimal with respect to a mission objective. For a
general optimal control problem with jointly infeasible safety constraints, we
prove that a single optimization problem cannot find a blamelessly optimal
controller. Instead, finding blamelessly optimal control actions requires
sequentially solving at least two optimal control problems: one to determine
the highest priority level of constraints that is feasible and another to
determine the optimal control action with respect to these constraints. We
apply our results to a helicopter emergency landing scenario in which violating
at least one safety-induced landing constraint is unavoidable. Leveraging the
concept of blameless optimality, we formulate blamelessly optimal controllers
that can autonomously prioritize human safety over property integrity
- …
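The two-stage procedure described in this abstract can be sketched on a one-dimensional toy problem. Everything here is an illustrative assumption rather than the paper's formulation: the control is a scalar, each safety constraint is an interval listed from highest to lowest priority, "highest priority level that is feasible" is read as the longest jointly feasible prefix of that list, and the mission objective is distance to a hypothetical `target` value.

```python
def feasible_prefix(constraints):
    """Stage 1: intersect intervals in priority order and stop at the first
    constraint that would make the intersection empty."""
    lo, hi = float("-inf"), float("inf")
    kept = 0
    for c_lo, c_hi in constraints:
        new_lo, new_hi = max(lo, c_lo), min(hi, c_hi)
        if new_lo > new_hi:
            break  # this priority level cannot be met together with the higher ones
        lo, hi, kept = new_lo, new_hi, kept + 1
    return kept, (lo, hi)


def best_action(target, interval):
    """Stage 2: minimize |u - target| over the feasible interval (a projection)."""
    lo, hi = interval
    return min(max(target, lo), hi)


if __name__ == "__main__":
    # Hypothetical constraints, highest priority first; the third conflicts with the second.
    constraints = [(0.0, 10.0), (4.0, 6.0), (8.0, 9.0)]
    kept, interval = feasible_prefix(constraints)
    print(kept, interval, best_action(7.5, interval))
```

The second stage only runs once the first has fixed which constraints are enforced, which reflects the sequential structure the abstract argues is necessary.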