Risk-sensitive Inverse Reinforcement Learning via Semi- and Non-Parametric Methods
The literature on Inverse Reinforcement Learning (IRL) typically assumes that
humans take actions in order to minimize the expected value of a cost function,
i.e., that humans are risk neutral. Yet, in practice, humans are often far from
being risk neutral. To fill this gap, the objective of this paper is to devise
a framework for risk-sensitive IRL in order to explicitly account for a human's
risk sensitivity. To this end, we propose a flexible class of models based on
coherent risk measures, which allow us to capture an entire spectrum of risk
preferences from risk-neutral to worst-case. We propose efficient
non-parametric algorithms based on linear programming and semi-parametric
algorithms based on maximum likelihood for inferring a human's underlying risk
measure and cost function for a rich class of static and dynamic
decision-making settings. The resulting approach is demonstrated on a simulated
driving game with ten human participants. Our method is able to infer and mimic
a wide range of qualitatively different driving styles from highly risk-averse
to risk-neutral in a data-efficient manner. Moreover, comparisons of the
Risk-Sensitive (RS) IRL approach with a risk-neutral model show that the RS-IRL
framework more accurately captures observed participant behavior both
qualitatively and quantitatively, especially in scenarios where catastrophic
outcomes such as collisions can occur.
Comment: Submitted to International Journal of Robotics Research; Revision 1:
(i) Clarified minor technical points; (ii) Revised proof for Theorem 3 to
hold under weaker assumptions; (iii) Added additional figures and expanded
discussions to improve readability.
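To make the "spectrum of risk preferences from risk-neutral to worst-case" concrete, here is a minimal sketch using Conditional Value-at-Risk (CVaR), one well-known coherent risk measure. It is an illustration only, not the paper's algorithm: the function name `cvar`, the tail-fraction convention for `alpha`, and the example costs are all assumptions introduced here.

```python
import numpy as np

def cvar(costs, probs, alpha):
    """CVaR of a discrete cost distribution.

    Assumed convention: `alpha` is the tail fraction, so the result is the
    average cost over the worst `alpha` share of probability mass.
    alpha = 1 gives the expected cost (risk neutral); as alpha shrinks
    toward 0 the result approaches the worst-case cost.
    """
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must lie in (0, 1]")
    costs = np.asarray(costs, dtype=float)
    probs = np.asarray(probs, dtype=float)
    order = np.argsort(costs)[::-1]          # highest (worst) cost first
    remaining = alpha                        # tail mass still to be filled
    total = 0.0
    for c, p in zip(costs[order], probs[order]):
        w = min(p, remaining)                # take as much mass as the tail allows
        total += w * c
        remaining -= w
        if remaining <= 1e-12:
            break
    return total / alpha

# Hypothetical driving outcomes: no incident, near miss, collision.
costs = [0.0, 1.0, 10.0]
probs = [0.5, 0.4, 0.1]
for alpha in (1.0, 0.5, 0.1):
    print(alpha, cvar(costs, probs, alpha))
```

With these made-up numbers, `alpha = 1.0` recovers the mean cost (about 1.4), `alpha = 0.5` gives about 2.8, and `alpha = 0.1` returns the collision cost itself, so a single parameter sweeps from risk-neutral to worst-case evaluation.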
Evolution of wealth in a nonconservative economy driven by local Nash equilibria
We develop a model for the evolution of wealth in a non-conservative economic
environment, extending a theory developed earlier by the authors. The model
considers a system of rational agents interacting in a game theoretical
framework. This evolution drives the dynamics of the agents in both wealth and
economic configuration variables. The cost function is chosen to represent a
risk averse strategy of each agent. That is, the more predictable the market,
and therefore the smaller the agent's individual risk, the more likely the
agent is to interact with the market. This yields a kinetic equation for an effective
single particle agent density with a Nash equilibrium serving as the local
thermodynamic equilibrium. We consider a regime of scale separation where the
large scale dynamics is given by a hydrodynamic closure with this local
equilibrium. A class of generalized collision invariants (GCIs) is developed to
overcome the difficulty of the non-conservative property in the hydrodynamic
closure derivation of the large scale dynamics for the evolution of wealth
distribution. The result is a system of gas dynamics-type equations for the
density and average wealth of the agents on large scales. We recover the
inverse Gamma distribution, which has been previously considered in the
literature, as a local equilibrium for particular choices of the cost function.
Behavior Identification and Prediction for a Probabilistic Risk Framework
Operation in real-world traffic requires autonomous vehicles to be able to
plan their motion in complex environments (multiple moving participants).
Planning through such an environment requires the right search space to be
provided for the trajectory or maneuver planners so that the safest motion for
the ego vehicle can be identified. Given the current states of the environment
and its participants, analyzing the risks based on the predicted trajectories
of all the traffic participants provides the necessary search space for the
planning of motion. This paper provides a fresh taxonomy of safety / risks that
an autonomous vehicle should be able to handle while navigating through
traffic. It provides a reference system architecture to be implemented and
describes a novel way of identifying and predicting the
behaviors of the traffic participants using classic Multi Model Adaptive
Estimation (MMAE). Preliminary simulation results of the implemented model are
included.
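As a rough illustration of the idea behind multiple-model estimation, the sketch below shows only the Bayes update of hypothesis probabilities from per-model measurement residuals. It is not the paper's implementation: a full MMAE scheme typically runs a bank of filters to produce those residuals, and the function name `mmae_update`, the two behavior hypotheses, and all numbers here are assumptions made for the example.

```python
import numpy as np

def mmae_update(prior, residuals, sigmas):
    """One Bayes update of model probabilities.

    prior     : current probability of each behavior model
    residuals : measurement minus each model's predicted measurement
    sigmas    : assumed residual standard deviation of each model
    """
    prior = np.asarray(prior, dtype=float)
    residuals = np.asarray(residuals, dtype=float)
    sigmas = np.asarray(sigmas, dtype=float)
    # Gaussian likelihood of the observed residual under each model.
    likelihood = np.exp(-0.5 * (residuals / sigmas) ** 2) / (np.sqrt(2.0 * np.pi) * sigmas)
    posterior = prior * likelihood
    return posterior / posterior.sum()       # renormalize to sum to one

# Hypothetical models: "keep lane" predicts 0.0 m/s lateral speed,
# "change lane" predicts 1.0 m/s. Observations are made up.
predictions = np.array([0.0, 1.0])
probs = np.array([0.5, 0.5])
for z in (0.9, 1.1, 1.0):
    probs = mmae_update(probs, z - predictions, sigmas=[0.3, 0.3])
print(probs)
```

The model whose predictions stay closest to the observations accumulates probability, which is what allows a behavior to be identified and then used for prediction. In practice the normalization step would need guarding against numerical underflow when all likelihoods are very small.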
Sparse Interacting Gaussian Processes: Efficiency and Optimality Theorems of Autonomous Crowd Navigation
We study the sparsity and optimality properties of crowd navigation and find
that existing techniques do not satisfy both criteria simultaneously: either
they achieve optimality with a prohibitive number of samples or tractability
assumptions make them fragile to catastrophe. For example, if the human and
robot are modeled independently, then tractability is attained but the planner
is prone to overcautious or overaggressive behavior. For sampling based motion
planning of joint human-robot cost functions, for agents and step
lookahead, samples are needed for coverage of the
action space. Advanced approaches statically partition the action space into
free-space and then sample in those convex regions. However, if the human is
\emph{moving} into free-space, then the partition is misleading and sampling is
unsafe: free space will soon be occupied. We diagnose the cause of these
deficiencies---optimization happens over \emph{trajectory} space---and propose
a novel solution: optimize over trajectory \emph{distribution} space by using a
Gaussian process (GP) basis. We exploit the "kernel trick" of GPs, where a
continuum of trajectories is captured with a mean and covariance function. By
using the mean and covariance as proxies for a trajectory family we reason
about collective trajectory behavior without resorting to sampling. The GP
basis is sparse and optimal with respect to collision avoidance and robot and
crowd intention and flexibility. GP sparsity leans heavily on the insight that
joint action space decomposes into free regions; however, the decomposition
contains feasible solutions only if the partition is dynamically generated. We
call our approach \emph{-sparse interacting Gaussian
processes}
SLAP: Simultaneous Localization and Planning Under Uncertainty for Physical Mobile Robots via Dynamic Replanning in Belief Space: Extended version
Simultaneous Localization and Planning (SLAP) is a crucial ability for an
autonomous robot operating under uncertainty. In its most general form, SLAP
induces a continuous POMDP (partially-observable Markov decision process),
which needs to be repeatedly solved online. This paper addresses this problem
and proposes a dynamic replanning scheme in belief space. The underlying POMDP,
which is continuous in state, action, and observation space, is approximated
offline via sampling-based methods, but operates in a replanning loop online to
admit local improvements to the coarse offline policy. This construct enables
the proposed method to combat changing environments and large localization
errors, even when the change alters the homotopy class of the optimal
trajectory. It further outperforms the state-of-the-art FIRM (Feedback-based
Information RoadMap) method by eliminating unnecessary stabilization steps.
Applying belief space planning to physical systems brings with it a plethora of
challenges. A key focus of this paper is to implement the proposed planner on a
physical robot and show the SLAP solution performance under uncertainty, in
changing environments and in the presence of large disturbances, such as a
kidnapped robot situation.
Comment: 20 pages, updated figures, extended theory and simulation results
Safe Reinforcement Learning with Scene Decomposition for Navigating Complex Urban Environments
Navigating urban environments represents a complex task for automated
vehicles. They must reach their goal safely and efficiently while considering a
multitude of traffic participants. We propose a modular decision making
algorithm to autonomously navigate intersections, addressing challenges of
existing rule-based and reinforcement learning (RL) approaches. We first
present a safe RL algorithm relying on a model-checker to ensure safety
guarantees. To make the decision strategy robust to perception errors and
occlusions, we introduce a belief update technique using a learning based
approach. Finally, we use a scene decomposition approach to scale our algorithm
to environments with multiple traffic participants. We empirically demonstrate
that our algorithm outperforms rule-based methods and reinforcement learning
techniques on a complex intersection scenario.
Comment: 8 pages; 7 figures
A Method for Estimating the Probability of Extremely Rare Accidents in Complex Systems
Estimating the probability of failures or accidents with aerospace systems is
often necessary when new concepts or designs are introduced, as is being
done for Autonomous Aircraft. If the design is safe, as it is supposed to be,
accident cases are hard to find. Such analysis needs a variance reduction
technique, and several algorithms exist for that. However, specific model
features may cause difficulties in practice, such as system models
where independent agents have to autonomously accomplish missions within finite
time, likely in the presence of human agents. For handling these
scenarios, this paper presents a novel estimation approach, based on the
combination of the well-established variance reduction technique of
Interacting Particles System (IPS) with the long-standing optimization
algorithm known as DIviding RECTangles (DIRECT). When combined, these two
techniques yield statistically significant results for extremely low
probabilities. In addition, this novel approach allows the identification of
intermediate events and simplifies the evaluation of sensitivity of the
estimated probabilities to certain system parameters.
A-Evac: the evacuation simulator for stochastic environment
We introduce Aamks, open-source software for fire risk assessment. This
article focuses on a component of Aamks: an evacuation simulator named a-evac.
A-evac models evacuation of humans in the fire environment produced by the CFAST
fire simulator. In the article we discuss the probabilistic evacuation
approach, automatic planning of exit routes, the interactions amongst the
moving evacuees and the impact of smoke on the humans. The results consist of
risk values based on FED, F-N curves and evacuation animations.
Comment: Source code of the software described in this article can be found at
http://github.com/aamk
Generating Comfortable, Safe and Comprehensible Trajectories for Automated Vehicles in Mixed Traffic
While motion planning approaches for automated driving often focus on safety
and mathematical optimality with respect to technical parameters, they barely
consider convenience, perceived safety for the passenger and comprehensibility
for other traffic participants. For automated driving in mixed traffic,
however, this is key to reach public acceptance. In this paper, we revise the
problem statement of motion planning in mixed traffic: Instead of largely
simplifying the motion planning problem to a convex optimization problem, we
keep a more complex probabilistic multi agent model and strive for a near
optimal solution. We assume cooperation of other traffic participants while
remaining aware of possible violations of this assumption. This approach yields solutions
that are provably safe in all situations, and convenient and comprehensible in
situations that are also unambiguous for humans. Thus, it outperforms existing
approaches in mixed traffic scenarios, as we show in simulation.
Optimal Alarms for Vehicular Collision Detection
An important application of intelligent vehicles is advance detection of
dangerous events such as collisions. This problem is framed as a problem of
optimal alarm choice given predictive models for vehicle location and motion.
Techniques for real-time collision detection are surveyed and grouped into
three classes: random Monte Carlo sampling, faster deterministic
approximations, and machine learning models trained by simulation. Theoretical
guarantees on the performance of these collision detection techniques are
provided where possible, and empirical analysis is provided for two example
scenarios. Results validate Monte Carlo sampling as a robust solution despite
its simplicity.
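The Monte Carlo class of collision detectors mentioned in this abstract can be sketched in a few lines. The example below is an assumption-laden illustration rather than the paper's method: it supposes the predicted relative position of two vehicles at some future time is Gaussian, treats a collision as the relative distance falling below a radius, and uses a hypothetical function name `collision_probability` and a hypothetical alarm threshold.

```python
import numpy as np

def collision_probability(rel_mean, rel_cov, radius, n_samples=100_000, seed=0):
    """Monte Carlo estimate of P(||relative position|| < radius).

    rel_mean, rel_cov : mean and covariance of the predicted 2-D relative
                        position (assumed Gaussian for this sketch)
    radius            : distance below which the vehicles are taken to collide
    """
    rng = np.random.default_rng(seed)
    samples = rng.multivariate_normal(rel_mean, rel_cov, size=n_samples)
    hits = np.linalg.norm(samples, axis=1) < radius
    return float(hits.mean())                # fraction of sampled futures that collide

# Hypothetical scenario: vehicles predicted to be close, unit position uncertainty.
p = collision_probability([0.0, 0.0], np.eye(2), radius=1.0)
ALARM_THRESHOLD = 0.2                        # made-up value for illustration
print(p, p > ALARM_THRESHOLD)
```

For a zero mean and identity covariance the exact probability is 1 - exp(-radius^2 / 2), about 0.39 for a radius of 1, which gives a quick sanity check on the estimate. The choice of alarm threshold is where the trade-off between missed detections and false alarms enters.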