Online inverse reinforcement learning with unknown disturbances
This paper addresses the problem of online inverse reinforcement learning for
nonlinear systems with modeling uncertainties in the presence of unknown
disturbances. The developed approach observes an agent's state and input
trajectories and identifies the unknown reward function online. Sub-optimality
introduced into the observed trajectories by the unknown external disturbance
is compensated for using a novel model-based inverse reinforcement learning
approach. An observer estimates the external disturbances, and the resulting
estimates are used to learn a dynamic model of the demonstrator. The learned
demonstrator model, together with the observed suboptimal trajectories, is
then used to perform inverse reinforcement learning. Theoretical guarantees
are established using Lyapunov theory, and a simulation example demonstrates
the effectiveness of the proposed technique.
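The disturbance-compensation idea can be sketched as follows. This is a minimal illustrative example assuming known linear dynamics and a constant disturbance, not the paper's nonlinear formulation or its exact update laws; the matrices A and B, the demonstrator gain K, and the observer gain are placeholders.

```python
import numpy as np

# Minimal sketch of the disturbance-compensation idea (illustrative only):
# a discrete-time observer estimates an unknown additive disturbance d from
# one-step prediction errors; the compensated model can then be used for IRL.
# Assumed dynamics: x_{k+1} = A x_k + B u_k + d, with A, B, d, K placeholders.

A = np.array([[1.0, 0.1], [0.0, 1.0]])
B = np.array([[0.0], [0.1]])
d_true = np.array([0.05, -0.02])           # unknown constant disturbance
K = np.array([[0.5, 0.8]])                 # demonstrator's feedback gain (unknown to us)

x = np.array([1.0, 0.0])
d_hat = np.zeros(2)
gain = 0.6                                 # observer gain in (0, 1)

for k in range(200):
    u = -K @ x                             # observed demonstrator input
    x_next = A @ x + B @ u + d_true        # observed next state
    pred = A @ x + B @ u + d_hat           # prediction with the current estimate
    d_hat = d_hat + gain * (x_next - pred) # correct the estimate toward the residual
    x = x_next

print("estimated disturbance:", d_hat)     # approaches d_true
```

Once the disturbance estimate converges, the compensated dynamics model can be combined with the observed suboptimal trajectories in the IRL stage, as the abstract describes.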
Online Observer-Based Inverse Reinforcement Learning
In this paper, a novel approach to the output-feedback inverse reinforcement
learning (IRL) problem is developed by casting IRL, for linear systems with
quadratic cost functions, as a state estimation problem. Two
observer-based techniques for IRL are developed, including a novel observer
method that re-uses previous state estimates via history stacks. Theoretical
guarantees for convergence and robustness are established under appropriate
excitation conditions. Simulations demonstrate the performance of the developed
observers and filters under noisy and noise-free measurements.
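The history-stack idea can be illustrated with a small sketch. It is not the paper's observer or filter, only a toy estimate of the demonstrator's linear feedback gain from stacked, noisy (state, input) data; the gain K_true, the noise level, and the step size are assumptions.

```python
import numpy as np

# A minimal sketch of the history-stack idea (illustrative only; the paper's
# observers estimate the cost parameters themselves): recorded (x, u) pairs
# are stored in a stack and reused at every step, so the estimate of the
# demonstrator's linear feedback u = -K x keeps improving even when the
# current data are not exciting.

rng = np.random.default_rng(0)
K_true = np.array([[0.4, 0.9]])

# History stack of observed state/input pairs (with measurement noise).
X = rng.normal(size=(50, 2))                            # stacked states
U = X @ (-K_true.T) + 0.01 * rng.normal(size=(50, 1))   # stacked inputs

K_hat = np.zeros((1, 2))
lr = 0.1
for _ in range(500):
    # Gradient step on the stacked least-squares residual ||U + X K_hat^T||^2.
    resid = U + X @ K_hat.T                              # (50, 1)
    K_hat = K_hat - lr * (resid.T @ X) / len(X)

print("estimated gain:", K_hat)                          # approaches K_true
```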
Compatible Reward Inverse Reinforcement Learning
Inverse Reinforcement Learning (IRL) is an effective approach to recover a reward function that explains the behavior of an expert by observing a set of demonstrations. This paper presents a novel model-free IRL approach that, unlike most existing IRL algorithms, does not require specifying a function space in which to search for the expert's reward function. Leveraging the fact that the policy gradient must be zero for any optimal policy, the algorithm generates a set of basis functions that span the subspace of reward functions making the policy gradient vanish. Within this subspace, using a second-order criterion, we search for the reward function that most penalizes deviations from the expert's policy. After introducing our approach for finite domains, we extend it to continuous ones. The proposed approach is empirically compared to other IRL methods both in the (finite) Taxi domain and in the (continuous) Linear Quadratic Gaussian (LQG) and Car on the Hill environments.
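The core linear-algebra step behind the basis construction can be sketched as follows, assuming a reward that is linear in features and a stand-in Jacobian of the policy gradient with respect to the reward weights; in the actual method this Jacobian would be estimated from the expert's demonstrations.

```python
import numpy as np

# Minimal sketch (assumptions: reward is linear in features, r = Phi @ w, and
# jac[i, j] = d/dw_j of the i-th policy-gradient component at the expert's
# policy; here jac is a random stand-in).  Reward weights for which the
# expert's policy gradient vanishes form the null space of jac; a basis of
# that subspace plays the role of the "compatible" reward basis that the
# second-order criterion then searches.

rng = np.random.default_rng(1)
n_policy_params, n_reward_feats = 4, 10
jac = rng.normal(size=(n_policy_params, n_reward_feats))  # stand-in Jacobian

# Null-space basis via SVD: right singular vectors with (near-)zero singular value.
_, s, vt = np.linalg.svd(jac)
tol = max(jac.shape) * np.finfo(float).eps * s.max()
null_basis = vt[np.sum(s > tol):]           # (n_reward_feats - rank) x n_reward_feats

# Any w in this span makes the (linearised) policy gradient zero:
w = null_basis.T @ rng.normal(size=null_basis.shape[0])
print("||jac @ w|| =", np.linalg.norm(jac @ w))  # close to 0
```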
Kernel Density Bayesian Inverse Reinforcement Learning
Inverse reinforcement learning (IRL) is a powerful framework to infer an
agent's reward function by observing its behavior, but IRL algorithms that
learn point estimates of the reward function can be misleading because there
may be several functions that describe an agent's behavior equally well. A
Bayesian approach to IRL models a distribution over candidate reward functions,
alleviating the shortcomings of learning a point estimate. However, several
Bayesian IRL algorithms use a Q-value function in place of the likelihood
function. The resulting posterior is computationally intensive to calculate,
has few theoretical guarantees, and the Q-value function is often a poor
approximation for the likelihood. We introduce kernel density Bayesian IRL
(KD-BIRL), which uses conditional kernel density estimation to directly
approximate the likelihood, providing an efficient framework that, with a
modified reward function parameterization, is applicable to environments with
complex and infinite state spaces. We demonstrate KD-BIRL's benefits through a
series of experiments in Gridworld environments and a simulated sepsis
treatment task.
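A one-dimensional sketch of the conditional kernel density likelihood is given below. The data, the scalar reward parameter theta, and the use of SciPy's gaussian_kde are illustrative assumptions, not the paper's Gridworld or sepsis setups.

```python
import numpy as np
from scipy.stats import gaussian_kde

# Minimal 1-D sketch of the conditional-KDE likelihood: given auxiliary
# rollouts (theta_i, a_i) where action a_i was generated under reward
# parameter theta_i, approximate p(a | theta) = p(a, theta) / p(theta) with
# two kernel density estimates, then score candidate thetas against the
# expert's actions.

rng = np.random.default_rng(2)
theta_train = rng.uniform(-1, 1, size=2000)
a_train = theta_train + 0.2 * rng.normal(size=2000)    # behaviour depends on theta

joint = gaussian_kde(np.vstack([a_train, theta_train]))
marginal = gaussian_kde(theta_train)

def log_likelihood(theta, expert_actions):
    """Sum of log p(a | theta) over the expert demonstration."""
    pts = np.vstack([expert_actions, np.full_like(expert_actions, theta)])
    cond = joint(pts) / marginal(np.full_like(expert_actions, theta))
    return np.sum(np.log(cond + 1e-12))

expert_actions = 0.5 + 0.2 * rng.normal(size=20)        # expert acts as if theta is near 0.5
grid = np.linspace(-1, 1, 41)
scores = np.array([log_likelihood(t, expert_actions) for t in grid])
print("maximum-likelihood theta on the grid:", grid[np.argmax(scores)])  # near 0.5
```

Under a flat prior, these grid scores are (up to a constant) the log posterior over the reward parameter, which is the quantity a Bayesian IRL method would sample or integrate.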