
    Bayesian multitask inverse reinforcement learning

    We generalise the problem of inverse reinforcement learning to multiple tasks, from multiple demonstrations. Each demonstration may represent one expert trying to solve a different task, or a different expert trying to solve the same task. Our main contribution is to formalise the problem as statistical preference elicitation via a number of structured priors, whose form captures our biases about the relatedness of different tasks or expert policies. In doing so, we introduce a prior on policy optimality, which is more natural to specify. We show that our framework allows us not only to learn efficiently from multiple experts but also to effectively differentiate between the goals of each. Possible applications include analysing the intrinsic motivations of subjects in behavioural experiments and learning from multiple teachers. (Comment: corrected version; 13 pages, 8 figures.)
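
    The multitask construction lends itself to a compact illustration. The Python sketch below is a minimal Metropolis-within-Gibbs sampler for a toy version of the setup: each expert's task reward is drawn from a shared Gaussian prior, and demonstrations are scored under a softmax-optimal (Boltzmann) expert model. The toy MDP, the Boltzmann likelihood, and all hyperparameters here are illustrative assumptions, not the authors' actual priors or inference procedure.

        import numpy as np

        # Toy multitask Bayesian IRL: each expert m has a task reward r_m
        # drawn from a shared Gaussian prior N(mu, sigma^2 I), and acts
        # softmax-optimally (Boltzmann policy) with respect to r_m.
        rng = np.random.default_rng(0)
        n_states, n_actions, gamma = 4, 2, 0.9
        P = rng.dirichlet(np.ones(n_states), size=(n_actions, n_states))  # P[a, s, s']

        def q_values(r, n_iter=200):
            # Value iteration for Q(s, a) under state-dependent reward r.
            Q = np.zeros((n_states, n_actions))
            for _ in range(n_iter):
                Q = r[:, None] + gamma * np.einsum('ast,t->sa', P, Q.max(axis=1))
            return Q

        def log_lik(r, demos, beta=5.0):
            # Log-likelihood of (state, action) pairs under a Boltzmann expert.
            Q = beta * q_values(r)
            m = Q.max(axis=1, keepdims=True)
            logpi = Q - m - np.log(np.exp(Q - m).sum(axis=1, keepdims=True))
            return sum(logpi[s, a] for s, a in demos)

        # Simulate two experts solving different tasks (different true rewards),
        # each acting greedily 90% of the time and randomly otherwise.
        true_rewards = [np.array([1.0, 0.0, 0.0, 0.0]), np.array([0.0, 0.0, 0.0, 1.0])]
        demos = []
        for r in true_rewards:
            greedy = q_values(r).argmax(axis=1)
            demos.append([(s, int(greedy[s]) if rng.random() < 0.9
                           else int(rng.integers(n_actions)))
                          for s in rng.integers(n_states, size=30)])

        # Metropolis-within-Gibbs: propose each task's reward, then update the
        # shared prior mean mu in closed form (Gaussian-Gaussian conjugacy).
        sigma, tau = 0.5, 1.0
        rewards = [np.zeros(n_states) for _ in demos]
        mu = np.zeros(n_states)
        for _ in range(300):
            for m, D in enumerate(demos):
                log_post = lambda r: log_lik(r, D) - ((r - mu) ** 2).sum() / (2 * sigma ** 2)
                proposal = rewards[m] + 0.1 * rng.normal(size=n_states)
                if np.log(rng.random()) < log_post(proposal) - log_post(rewards[m]):
                    rewards[m] = proposal
            prec = len(demos) / sigma ** 2 + 1.0 / tau ** 2
            mu = (sum(rewards) / sigma ** 2) / prec + rng.normal(size=n_states) / np.sqrt(prec)

        print("Inferred task rewards:", [r.round(2) for r in rewards])

    Because the per-task rewards are tied together only through the shared mean mu, the same sampler covers both readings of the problem: distinct tasks pull the posteriors apart, while repeated demonstrations of one task pull them toward a common reward.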

    FIAS Scientific Report 2011

    In the year 2010 the Frankfurt Institute for Advanced Studies successfully continued to follow its agenda of pursuing theoretical research in the natural sciences. As stipulated in its charter, FIAS closely collaborates with extramural research institutions, such as the Max Planck Institute for Brain Research in Frankfurt and the GSI Helmholtz Center for Heavy Ion Research in Darmstadt, and with research groups at the science departments of Goethe University. The institute also engages in the training of young researchers and the education of doctoral students. This Annual Report documents how these goals were pursued in the year 2010. Notable events in the scientific life of the institute are presented, e.g., teaching activities in the framework of the Frankfurt International Graduate School for Science (FIGSS), colloquium schedules, conferences organized by FIAS, and a full bibliography of publications by authors affiliated with FIAS. The main part of the report consists of short one-page summaries describing the scientific progress achieved in individual research projects in the year 2010.

    Final report key contents: main results accomplished by the EU-Funded project IM-CLeVeR - Intrinsically Motivated Cumulative Learning Versatile Robots

    This document presents the main scientific and technological achievements of the project IM-CLeVeR. It is organised as follows:
    1. Project executive summary: a brief overview of the project vision, objectives and keywords.
    2. Beneficiaries of the project and contacts: a list of the project Teams (partners), their Team Leaders, and contact details.
    3. Project context and objectives: the vision of the project and its overall objectives.
    4. Overview of work performed and main results achieved: a one-page overview of the main results of the project.
    5. Overview of main results per partner: a bullet-point list of the main results per partner.
    6. Main achievements in detail, per partner: a thorough explanation of the main results per partner (including collaborative work), with references to the main publications supporting them.

    Policy Explanation and Model Refinement in Decision-Theoretic Planning

    Decision-theoretic systems, such as Markov Decision Processes (MDPs), are used for sequential decision-making under uncertainty. MDPs provide a generic framework that can be applied in various domains to compute optimal policies. This thesis presents techniques that explain optimal policies for MDPs and then refine decision-theoretic models (Bayesian networks and MDPs) based on feedback from experts.

    Explaining policies for sequential decision-making problems is difficult due to the presence of stochastic effects, multiple possibly competing objectives, and long-range effects of actions. However, explanations are needed to assist experts in validating that the policy is correct and to help users develop trust in the choices recommended by the policy. A set of domain-independent templates for justifying a policy recommendation is presented, along with a process for identifying the minimum number of templates that need to be populated to completely justify the policy.

    The rejection of an explanation by a domain expert indicates a deficiency in the model that led to the generation of the rejected policy. This thesis presents techniques to refine the model parameters so that the optimal policy computed from the refined parameters conforms with the expert feedback. The expert feedback is translated into constraints on the model parameters that are used during refinement. These constraints are non-convex for both Bayesian networks and MDPs. For Bayesian networks, the refinement approach is based on Gibbs sampling and stochastic hill climbing, and it learns a model that obeys the expert constraints. For MDPs, the parameter space is partitioned so that alternating linear optimization can be applied to learn model parameters that lead to a policy in accordance with the expert feedback.

    In practice, the state space of an MDP can be very large, which is an issue for real-world problems. Factored MDPs are often used to address this: state variables represent the state space and dynamic Bayesian networks model the transition functions, avoiding the exponential growth in the state space associated with large and complex problems. The approaches for explanation and refinement presented in this thesis are also extended to the factored case to demonstrate their use in real-world applications. The domains of course advising for undergraduate students, assisted hand-washing for people with dementia, and diagnostics for manufacturing are used for empirical evaluations.
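
    As a concrete anchor for the setting the abstract builds on, here is a minimal Python sketch: value iteration computes an optimal policy for a small MDP, and a simple template then justifies each recommended action by comparing expected values and naming the most likely successor state. The toy MDP and the template wording are assumptions for illustration; the thesis's domain-independent templates and refinement procedures are substantially richer.

        import numpy as np

        # Toy MDP: 3 states, 2 actions; P[a, s, s'] transition probabilities,
        # R[s, a] rewards, discount factor gamma.
        P = np.array([[[0.8, 0.2, 0.0],
                       [0.1, 0.8, 0.1],
                       [0.0, 0.2, 0.8]],
                      [[0.2, 0.8, 0.0],
                       [0.0, 0.2, 0.8],
                       [0.1, 0.0, 0.9]]])
        R = np.array([[0.0, 0.0],
                      [0.0, 1.0],
                      [1.0, 0.5]])
        gamma = 0.95

        # Value iteration: Q(s, a) = R(s, a) + gamma * sum_s' P(s'|s, a) * V(s').
        Q = np.zeros((3, 2))
        for _ in range(500):
            Q = R + gamma * np.einsum('ast,t->sa', P, Q.max(axis=1))
        V, policy = Q.max(axis=1), Q.argmax(axis=1)

        # A template-style justification for each recommended action
        # (a stand-in for the thesis's richer explanation templates).
        for s in range(3):
            a, alt = policy[s], 1 - policy[s]
            succ = P[a, s].argmax()
            print(f"In state {s}, take action {a}: expected value {Q[s, a]:.2f} "
                  f"vs. {Q[s, alt]:.2f} for action {alt}; most likely successor "
                  f"is state {succ} (p = {P[a, s, succ]:.2f}, V = {V[succ]:.2f}).")

    In the refinement loop the abstract describes, an expert rejecting such a justification, e.g. insisting that the alternative action is correct in some state s, would translate into a constraint of the form Q(s, a_expert) >= Q(s, a_policy) on the model parameters, which is the kind of non-convex constraint the thesis's Gibbs-sampling and alternating-optimization approaches are designed to handle.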