Multitask Reinforcement Learning on the Distribution of MDPs
Abstract: In this paper we address a new problem in reinforcement learning: an agent that faces multiple learning tasks within its lifetime. The agent's objective is to maximize its total reward over the lifetime as well as the conventional return in each task. To realize this, it must be endowed with the ability to retain its past learning experiences and utilize them to improve future learning performance. We formalize this problem by introducing an environmental class, BV-MDPs, defined with a distribution over MDPs. As an approach to exploiting past learning experiences, we focus on statistics (mean and deviation) of the agent's value tables. The mean can be used to initialize the table when a new task is presented, while the deviation can be viewed as measuring the reliability of the mean, and we utilize it in calculating priorities for simulated backups. We conduct experiments in computer simulation to evaluate the effectiveness of the approach.
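The value-statistics idea lends itself to a short sketch. Below is a minimal, hypothetical Python illustration (not the paper's code), assuming a tabular Q-learning setting: a running mean and deviation of Q-tables are maintained across finished tasks, the mean seeds a new task's table, and the deviation ranks state-action pairs for simulated backups. All class and function names are assumptions.

```python
import numpy as np

class ValueTableStats:
    """Running mean/deviation of converged Q-tables across tasks (Welford's method)."""
    def __init__(self, n_states, n_actions):
        self.count = 0
        self.mean = np.zeros((n_states, n_actions))
        self.m2 = np.zeros((n_states, n_actions))  # running sum of squared deviations

    def update(self, q_table):
        """Fold in the converged Q-table of a finished task."""
        self.count += 1
        delta = q_table - self.mean
        self.mean += delta / self.count
        self.m2 += delta * (q_table - self.mean)

    def deviation(self):
        if self.count < 2:
            # Too few tasks: treat every entry as equally unreliable.
            return np.ones_like(self.mean)
        return np.sqrt(self.m2 / (self.count - 1))

def init_new_task(stats):
    """Initialize a new task's Q-table from the cross-task mean."""
    return stats.mean.copy()

def backup_priority(stats, s, a):
    """Higher deviation -> less reliable mean -> higher priority for simulated backups."""
    return stats.deviation()[s, a]
```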
Bayesian Multitask Inverse Reinforcement Learning
We generalise the problem of inverse reinforcement learning to multiple tasks, from multiple demonstrations. Each demonstration may represent one expert trying to solve a different task, or different experts trying to solve the same task. Our main contribution is to formalise the problem as statistical preference elicitation via a number of structured priors whose form captures our biases about the relatedness of different tasks or expert policies. In doing so, we introduce a prior on policy optimality, which is more natural to specify. We show that our framework allows us not only to learn efficiently from multiple experts but also to effectively differentiate between the goals of each. Possible applications include analysing the intrinsic motivations of subjects in behavioural experiments and learning from multiple teachers.
Comment: Corrected version. 13 pages, 8 figures.
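To make the "relatedness through a structured prior" idea concrete, here is a hedged toy sketch in Python. It is not the paper's model, which elicits preferences via priors on policy optimality; instead it ties hypothetical per-task reward weights together with a shared Gaussian prior and shows how a MAP estimate shrinks each task's estimate toward the pool. The generative story, noise model, and all variable names are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n_tasks, n_features = 4, 5

# Assumed generative story: shared mean mu, per-task reward
# weights w_k ~ N(mu, tau^2 I), so tasks are related but not identical.
mu = rng.normal(size=n_features)
tau = 0.3
true_w = mu + tau * rng.normal(size=(n_tasks, n_features))

# Pretend each task's demonstrations yield a noisy estimate of its weights.
sigma = 0.1
obs = true_w + sigma * rng.normal(size=true_w.shape)

# MAP estimate under the shared prior: each task's estimate is pulled
# toward the pooled mean -- the "relatedness" bias the prior encodes.
pooled = obs.mean(axis=0)
sigma2, tau2 = sigma**2, tau**2
w_map = (obs / sigma2 + pooled / tau2) / (1 / sigma2 + 1 / tau2)
print(w_map.round(2))
```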
Classifying Options for Deep Reinforcement Learning
In this paper we combine one method for hierarchical reinforcement learning, the options framework, with deep Q-networks (DQNs) through the use of different "option heads" on the policy network and a supervisory network for choosing between the different options. We utilise our setup to investigate the effects of architectural constraints in subtasks with positive and negative transfer, across a range of network capacities. We empirically show that our augmented DQN has lower sample complexity when simultaneously learning subtasks with negative transfer, without degrading performance when learning subtasks with positive transfer.
Comment: IJCAI 2016 Workshop on Deep Reinforcement Learning: Frontiers and Challenges.
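The described architecture can be sketched directly. Below is a minimal PyTorch illustration, under the assumption of a shared torso, one Q-value head per option, and a supervisory head that scores which option to follow; layer sizes and names are illustrative, not taken from the paper.

```python
import torch
import torch.nn as nn

class OptionHeadDQN(nn.Module):
    def __init__(self, obs_dim, n_actions, n_options, hidden=128):
        super().__init__()
        self.torso = nn.Sequential(nn.Linear(obs_dim, hidden), nn.ReLU())
        # One Q-head per option: each maps shared features to action values.
        self.option_heads = nn.ModuleList(
            nn.Linear(hidden, n_actions) for _ in range(n_options)
        )
        # Supervisory network: scores the options given the shared features.
        self.supervisor = nn.Linear(hidden, n_options)

    def forward(self, obs):
        h = self.torso(obs)
        q_per_option = torch.stack([head(h) for head in self.option_heads], dim=1)
        option_scores = self.supervisor(h)
        return q_per_option, option_scores

# Greedy control: the supervisor picks an option, its head picks the action.
net = OptionHeadDQN(obs_dim=8, n_actions=4, n_options=3)
q, scores = net(torch.randn(2, 8))        # q: (batch, options, actions)
opt = scores.argmax(dim=1)                # chosen option per sample
act = q[torch.arange(2), opt].argmax(dim=1)
```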
- …