Search CORE

478 research outputs found

Risk-sensitive Inverse Reinforcement Learning via Semi- and Non-Parametric Methods

Author: Lacotte Jonathan
Majumdar Anirudha
Pavone Marco
Singh Sumeet
Publication venue
Publication date: 01/01/2018
Field of study

The literature on Inverse Reinforcement Learning (IRL) typically assumes that humans take actions in order to minimize the expected value of a cost function, i.e., that humans are risk neutral. Yet, in practice, humans are often far from being risk neutral. To fill this gap, the objective of this paper is to devise a framework for risk-sensitive IRL in order to explicitly account for a human's risk sensitivity. To this end, we propose a flexible class of models based on coherent risk measures, which allow us to capture an entire spectrum of risk preferences from risk-neutral to worst-case. We propose efficient non-parametric algorithms based on linear programming and semi-parametric algorithms based on maximum likelihood for inferring a human's underlying risk measure and cost function for a rich class of static and dynamic decision-making settings. The resulting approach is demonstrated on a simulated driving game with ten human participants. Our method is able to infer and mimic a wide range of qualitatively different driving styles from highly risk-averse to risk-neutral in a data-efficient manner. Moreover, comparisons of the Risk-Sensitive (RS) IRL approach with a risk-neutral model show that the RS-IRL framework more accurately captures observed participant behavior both qualitatively and quantitatively, especially in scenarios where catastrophic outcomes such as collisions can occur.Comment: Submitted to International Journal of Robotics Research; Revision 1: (i) Clarified minor technical points; (ii) Revised proof for Theorem 3 to hold under weaker assumptions; (iii) Added additional figures and expanded discussions to improve readabilit

arXiv.org e-Print Archive

Princeton University Open Access Repository

An inexact Newton-Krylov algorithm for constrained diffeomorphic image registration

Author: Biros George
Mang Andreas
Publication venue: 'Society for Industrial & Applied Mathematics (SIAM)'
Publication date: 07/05/2015
Field of study

We propose numerical algorithms for solving large deformation diffeomorphic image registration problems. We formulate the nonrigid image registration problem as a problem of optimal control. This leads to an infinite-dimensional partial differential equation (PDE) constrained optimization problem. The PDE constraint consists, in its simplest form, of a hyperbolic transport equation for the evolution of the image intensity. The control variable is the velocity field. Tikhonov regularization on the control ensures well-posedness. We consider standard smoothness regularization based on

H^1

- or

H^2

-seminorms. We augment this regularization scheme with a constraint on the divergence of the velocity field rendering the deformation incompressible and thus ensuring that the determinant of the deformation gradient is equal to one, up to the numerical error. We use a Fourier pseudospectral discretization in space and a Chebyshev pseudospectral discretization in time. We use a preconditioned, globalized, matrix-free, inexact Newton-Krylov method for numerical optimization. A parameter continuation is designed to estimate an optimal regularization parameter. Regularity is ensured by controlling the geometric properties of the deformation field. Overall, we arrive at a black-box solver. We study spectral properties of the Hessian, grid convergence, numerical accuracy, computational efficiency, and deformation regularity of our scheme. We compare the designed Newton-Krylov methods with a globalized preconditioned gradient descent. We study the influence of a varying number of unknowns in time. The reported results demonstrate excellent numerical accuracy, guaranteed local deformation regularity, and computational efficiency with an optional control on local mass conservation. The Newton-Krylov methods clearly outperform the Picard method if high accuracy of the inversion is required.Comment: 32 pages; 10 figures; 9 table

arXiv.org e-Print Archive

CiteSeerX