Risk-sensitive Inverse Reinforcement Learning via Semi- and Non-Parametric Methods
The literature on Inverse Reinforcement Learning (IRL) typically assumes that
humans take actions in order to minimize the expected value of a cost function,
i.e., that humans are risk neutral. Yet, in practice, humans are often far from
being risk neutral. To fill this gap, the objective of this paper is to devise
a framework for risk-sensitive IRL in order to explicitly account for a human's
risk sensitivity. To this end, we propose a flexible class of models based on
coherent risk measures, which allow us to capture an entire spectrum of risk
preferences from risk-neutral to worst-case. We propose efficient
non-parametric algorithms based on linear programming and semi-parametric
algorithms based on maximum likelihood for inferring a human's underlying risk
measure and cost function for a rich class of static and dynamic
decision-making settings. The resulting approach is demonstrated on a simulated
driving game with ten human participants. Our method is able to infer and mimic
a wide range of qualitatively different driving styles from highly risk-averse
to risk-neutral in a data-efficient manner. Moreover, comparisons of the
Risk-Sensitive (RS) IRL approach with a risk-neutral model show that the RS-IRL
framework more accurately captures observed participant behavior both
qualitatively and quantitatively, especially in scenarios where catastrophic
outcomes such as collisions can occur.
Comment: Submitted to International Journal of Robotics Research; Revision 1:
(i) Clarified minor technical points; (ii) Revised proof for Theorem 3 to
hold under weaker assumptions; (iii) Added additional figures and expanded
discussions to improve readabilit
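To make the "spectrum of risk preferences from risk-neutral to worst-case" in the abstract above concrete, here is a minimal sketch using conditional value-at-risk (CVaR), one well-known coherent risk measure. The abstract does not say which risk measures the paper uses, so CVaR, the function name `cvar`, and the example numbers are assumptions chosen only for illustration.

```python
def cvar(costs, probs, alpha):
    """CVaR of a discrete cost distribution (illustrative, not the paper's model).

    alpha in (0, 1]: alpha = 1 gives the expected cost (risk-neutral);
    as alpha shrinks toward 0 the value approaches the worst-case cost.
    """
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must lie in (0, 1]")
    # Sort outcomes from most to least costly.
    outcomes = sorted(zip(costs, probs), key=lambda cp: cp[0], reverse=True)
    remaining = alpha  # tail probability mass still to be filled
    total = 0.0
    for cost, prob in outcomes:
        take = min(prob, remaining)  # split an atom if it straddles the tail
        total += take * cost
        remaining -= take
        if remaining <= 1e-12:
            break
    # Average cost over the worst alpha-fraction of outcomes.
    return total / alpha


# Hypothetical driving outcomes: no incident, near miss, collision.
costs = [0.0, 1.0, 10.0]
probs = [0.5, 0.4, 0.1]
risk_neutral = cvar(costs, probs, 1.0)   # expected cost
risk_averse = cvar(costs, probs, 0.1)    # mean of the worst 10% of outcomes
```

A single parameter `alpha` thus interpolates between the expectation and the worst case, which is the kind of spectrum the abstract describes.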
Data-driven Inverse Optimization with Imperfect Information
In data-driven inverse optimization an observer aims to learn the preferences
of an agent who solves a parametric optimization problem depending on an
exogenous signal. Thus, the observer seeks the agent's objective function that
best explains a historical sequence of signals and corresponding optimal
actions. We focus here on situations where the observer has imperfect
information, that is, where the agent's true objective function is not
contained in the search space of candidate objectives, where the agent suffers
from bounded rationality or implementation errors, or where the observed
signal-response pairs are corrupted by measurement noise. We formalize this
inverse optimization problem as a distributionally robust program minimizing
the worst-case risk that the predicted decision (i.e., the decision
implied by a particular candidate objective) differs from the agent's
actual response to a random signal. We show that our framework offers rigorous
out-of-sample guarantees for different loss functions used to measure
prediction errors and that the emerging inverse optimization problems can be
exactly reformulated as (or safely approximated by) tractable convex programs
when a new suboptimality loss function is used. We show through extensive
numerical tests that the proposed distributionally robust approach to inverse
optimization often attains better out-of-sample performance than
state-of-the-art approaches.
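The abstract above mentions a suboptimality loss without defining it. One common reading, assumed here, is the gap between the cost of the observed decision and the best achievable cost under a candidate objective. The sketch below uses a linear objective and a small finite feasible set, and omits the dependence on the exogenous signal for brevity; the function name and the data are hypothetical.

```python
def suboptimality_loss(theta, observed, feasible):
    """Cost gap of the observed decision under the candidate objective theta.

    theta:    candidate linear cost vector (assumed form of the objective)
    observed: the agent's observed decision
    feasible: finite list of feasible decisions (an illustrative simplification)
    """
    def cost(x):
        return sum(t * xi for t, xi in zip(theta, x))

    # Zero when the observed decision is optimal for theta, positive otherwise
    # (provided the observed decision is itself feasible).
    return cost(observed) - min(cost(x) for x in feasible)


feasible = [(0, 0), (1, 0), (0, 1)]
loss_consistent = suboptimality_loss((1, -1), (0, 1), feasible)    # 0
loss_inconsistent = suboptimality_loss((1, -1), (1, 0), feasible)  # 2
```

A candidate objective that explains the observed decisions well has a small loss on each signal-response pair; the paper's distributionally robust program would then control the worst-case risk of such a loss.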
Deep Mean-Shift Priors for Image Restoration
In this paper we introduce a natural image prior that directly represents a
Gaussian-smoothed version of the natural image distribution. We include our
prior in a formulation of image restoration as a Bayes estimator that also
allows us to solve noise-blind image restoration problems. We show that the
gradient of our prior corresponds to the mean-shift vector on the natural image
distribution. In addition, we learn the mean-shift vector field using denoising
autoencoders, and use it in a gradient descent approach to perform Bayes risk
minimization. We demonstrate competitive results for noise-blind deblurring,
super-resolution, and demosaicing.
Comment: NIPS 201
Inverse Optimization with Noisy Data
Inverse optimization refers to the inference of unknown parameters of an
optimization problem based on knowledge of its optimal solutions. This paper
considers inverse optimization in the setting where measurements of the optimal
solutions of a convex optimization problem are corrupted by noise. We first
provide a formulation for inverse optimization and prove it to be NP-hard. In
contrast to existing methods, we show that the parameter estimates produced by
our formulation are statistically consistent. Our approach involves combining a
new duality-based reformulation for bilevel programs with a regularization
scheme that smooths discontinuities in the formulation. Using epi-convergence
theory, we show the regularization parameter can be adjusted to approximate the
original inverse optimization problem to arbitrary accuracy, which we use to
prove our consistency results. Next, we propose two solution algorithms based
on our duality-based formulation. The first is an enumeration algorithm that is
applicable to settings where the dimensionality of the parameter space is
modest, and the second is a semiparametric approach that combines nonparametric
statistics with a modified version of our formulation. These numerical
algorithms are shown to maintain the statistical consistency of the underlying
formulation. Lastly, using both synthetic and real data, we demonstrate that
our approach performs competitively when compared with existing heuristics.
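As a rough illustration of estimating a parameter by enumeration from noisy optimal solutions, the sketch below scans a grid of candidate parameters and keeps the one whose predicted solutions best fit the observations in the least-squares sense. It is not the paper's duality-based formulation: the toy forward problem (minimize 0.5*x^2 - theta*u*x over x in [0, 1]), the grid, and the function names are all assumptions made for the example.

```python
def forward_solution(theta, u):
    """Optimal x for the toy problem: min 0.5*x**2 - theta*u*x over [0, 1]."""
    # The unconstrained minimizer is theta*u; project it onto [0, 1].
    return min(1.0, max(0.0, theta * u))


def enumerate_parameter(signals, noisy_solutions, grid):
    """Return the grid value minimizing the squared prediction error."""
    def squared_error(theta):
        return sum(
            (forward_solution(theta, u) - x) ** 2
            for u, x in zip(signals, noisy_solutions)
        )

    return min(grid, key=squared_error)


# Hypothetical data: solutions generated with theta = 0.5, then perturbed.
signals = [0.2, 0.6, 1.0, 1.5]
noisy_solutions = [0.12, 0.29, 0.51, 0.73]
grid = [0.0, 0.25, 0.5, 0.75, 1.0]
theta_hat = enumerate_parameter(signals, noisy_solutions, grid)
```

Enumeration of this kind scales poorly with the number of parameters, which matches the abstract's remark that the approach suits parameter spaces of modest dimension.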