29 research outputs found

Inverse Reinforcement Learning in Swarm Systems

Author: KhudaBukhsh Wasiur R.
Koeppl Heinz
Zoubir Abdelhak M.
Šošić Adrian
Publication date: 01/01/2017

Inverse reinforcement learning (IRL) has become a useful tool for learning behavioral models from demonstration data. However, IRL remains mostly unexplored for multi-agent systems. In this paper, we show how the principle of IRL can be extended to homogeneous large-scale problems, inspired by the collective swarming behavior of natural systems. In particular, we make the following contributions to the field: 1) We introduce the swarMDP framework, a sub-class of decentralized partially observable Markov decision processes endowed with a swarm characterization. 2) Exploiting the inherent homogeneity of this framework, we reduce the resulting multi-agent IRL problem to a single-agent one by proving that the agent-specific value functions in this model coincide. 3) To solve the corresponding control problem, we propose a novel heterogeneous learning scheme that is particularly tailored to the swarm setting. Results on two example systems demonstrate that our framework is able to produce meaningful local reward models from which we can replicate the observed global system dynamics.

Comment: 9 pages, 8 figures; Version 2: version accepted at AAMAS 201

arXiv.org e-Print Archive

TUbiblio

Efficient Probabilistic Performance Bounds for Inverse Reinforcement Learning

Author: Brown Daniel S.
Niekum Scott
Publication date: 29/04/2018

In the field of reinforcement learning there has been recent progress towards safety and high-confidence bounds on policy performance. However, to our knowledge, no practical methods exist for determining high-confidence policy performance bounds in the inverse reinforcement learning setting, where the true reward function is unknown and only samples of expert behavior are given. We propose a sampling method based on Bayesian inverse reinforcement learning that uses demonstrations to determine practical high-confidence upper bounds on the $\alpha$-worst-case difference in expected return between any evaluation policy and the optimal policy under the expert's unknown reward function. We evaluate our proposed bound on both a standard grid navigation task and a simulated driving task and achieve tighter and more accurate bounds than a feature count-based baseline. We also give examples of how our proposed bound can be utilized to perform risk-aware policy selection and risk-aware policy improvement. Because our proposed bound requires several orders of magnitude fewer demonstrations than existing high-confidence bounds, it is the first practical method that allows agents that learn from demonstration to express confidence in the quality of their learned policy.

Comment: In proceedings AAAI-1
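To make the notion of an $\alpha$-worst-case difference concrete, here is a minimal sketch that is not taken from the paper. It assumes we already hold a list of hypothetical loss samples, each being the gap in expected return between the optimal policy and the evaluation policy under one reward function drawn from a Bayesian IRL posterior. The function name `alpha_worst_case` and the sample values are illustrative assumptions, and the sketch reports only the empirical $\alpha$-quantile; it does not add the confidence margin that a high-confidence bound would need.

```python
import math


def alpha_worst_case(losses, alpha):
    """Empirical alpha-quantile (value-at-risk) of sampled policy losses.

    `losses` is assumed to hold one expected-return gap per sampled
    reward function; both the name and the inputs are hypothetical.
    """
    if not losses:
        raise ValueError("need at least one sample")
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in (0, 1]")
    ordered = sorted(losses)
    # Index of the smallest sample with at least an alpha fraction
    # of all samples at or below it.
    k = math.ceil(alpha * len(ordered)) - 1
    return ordered[k]


# Made-up losses, one per sampled reward function (illustration only).
sampled_losses = [0.0, 0.1, 0.05, 0.4, 0.2, 0.15, 0.3, 0.02, 0.25, 0.9]
median_loss = alpha_worst_case(sampled_losses, 0.5)
worst_loss = alpha_worst_case(sampled_losses, 1.0)
```

Choosing `alpha` close to 1 makes the estimate more conservative, since it is then driven by the largest sampled losses.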

arXiv.org e-Print Archive

Association for the Advancement of Artificial Intelligence: AAAI Publications

ICRA Roboethics Challenge 2023: Intelligent Disobedience in an Elderly Care Home

Author: Briggs Gordon
Mirsky Reuth
Paster Sveta
Rogers Kantwon
Stone Peter
Publication date: 15/11/2023

With the projected surge in the elderly population, service robots offer a promising avenue to enhance the well-being of residents in elderly care homes. Such robots will encounter complex scenarios that require them to make decisions with ethical consequences. In this report, we propose to leverage the Intelligent Disobedience framework in order to give the robot the ability to perform a deliberation process over decisions with potential ethical implications. We list the issues that this framework can assist with, define it formally in the context of the specific elderly care home scenario, and delineate the requirements for implementing an intelligently disobeying robot. We conclude this report with some critical analysis and suggestions for future work.

Comment: This report is part of ICRA roboethics competition: https://competition.raiselab.ca/competition-details-2023_1/ethics-challenge/submitted-proposals/submission-

arXiv.org e-Print Archive

Inverse Reinforcement Learning through Policy Gradient Minimization

Author: Pirotta Matteo
Restelli Marcello
Publication venue: AAAI Press
Publication date: 01/01/2016

Inverse Reinforcement Learning (IRL) deals with the problem of recovering the reward function optimized by an expert given a set of demonstrations of the expert's policy. Most IRL algorithms need to repeatedly compute the optimal policy for different reward functions. This paper proposes a new IRL approach that recovers the reward function without the need to solve any "direct" RL problem. The idea is to find the reward function that minimizes the gradient of a parameterized representation of the expert's policy. In particular, when the reward function can be represented as a linear combination of some basis functions, we will show that the aforementioned optimization problem can be efficiently solved. We present an empirical evaluation of the proposed approach on a multidimensional version of the Linear-Quadratic Regulator (LQR), both in the case where the parameters of the expert's policy are known and in the (more realistic) case where the parameters of the expert's policy need to be inferred from the expert's demonstrations. Finally, the algorithm is compared against the state-of-the-art on the mountain car domain, where the expert's policy is unknown.
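The gradient-minimization idea in the abstract above can be illustrated with a small sketch, which is not the authors' implementation. It assumes a reward that is a weighted sum of two basis functions, so the policy gradient at the expert's parameters is the same weighted sum of two per-feature gradients `g1` and `g2` (hypothetical inputs that would have to be estimated from demonstrations). It further assumes the weights are non-negative and sum to one, a normalization added here to rule out the trivial all-zero solution. Under those assumptions the squared gradient norm is a one-dimensional quadratic with a closed-form minimizer.

```python
def two_feature_weights(g1, g2):
    """Weights (w1, w2), w1 + w2 = 1 and w >= 0, minimizing
    ||w1 * g1 + w2 * g2||^2 for two per-feature policy gradients.

    Hypothetical helper for illustration; g1 and g2 are plain lists.
    """
    if len(g1) != len(g2):
        raise ValueError("gradients must have the same dimension")
    # Write the combination as g2 + t * (g1 - g2) with t = w1.
    d = [a - b for a, b in zip(g1, g2)]
    denom = sum(x * x for x in d)
    if denom == 0.0:
        # Identical gradients: every weighting gives the same norm.
        return 0.5, 0.5
    # Unconstrained minimizer of the quadratic in t, then clip to [0, 1].
    t = -sum(b * x for b, x in zip(g2, d)) / denom
    t = min(1.0, max(0.0, t))
    return t, 1.0 - t


def combined_gradient(w, g1, g2):
    """Policy gradient under the reward with weights w = (w1, w2)."""
    return [w[0] * a + w[1] * b for a, b in zip(g1, g2)]


# Made-up gradients that point in opposite directions.
w = two_feature_weights([1.0, 0.0], [-1.0, 0.0])
residual = combined_gradient(w, [1.0, 0.0], [-1.0, 0.0])
```

A residual gradient of zero means the expert's policy parameters are a stationary point for the recovered reward; with more than two basis functions the same objective becomes a small constrained quadratic program rather than a closed form.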

Archivio istituzionale della ricerca - Politecnico di Milano

Association for the Advancement of Artificial Intelligence: AAAI Publications

A comparative study between motivated learning and reinforcement learning

Author: GRAHAM James T.
HE Haibo
NI Zhen
STARZYK Janusz A.
TAN Ah-Hwee
TENG T.-H.
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/07/2015

This paper analyzes advanced reinforcement learning techniques and compares some of them to motivated learning. Motivated learning is briefly discussed, indicating its relation to reinforcement learning. A black box scenario for comparative analysis of learning efficiency in autonomous agents is developed and described. This is used to analyze selected algorithms. Reported results demonstrate that, in the selected category of problems, motivated learning outperformed all the reinforcement learning algorithms we compared it with.

Crossref

Institutional Knowledge at Singapore Management University

DigitalCommons@URI

Inverse KKT – Learning Cost Functions of Manipulation Tasks from Demonstrations

Author: Englert Peter
Toussaint Marc
Vien Ngo Anh
Publication date: 01/12/2017

Queen's University Belfast Research Portal