Relational Representations in Reinforcement Learning: Review and Open Problems
This paper is about representation in RL. We discuss some of the concepts in representation and generalization in reinforcement learning and argue for higher-order representations instead of the commonly used propositional representations. The paper contains a small review of current reinforcement learning systems that use higher-order representations, followed by a brief discussion, and ends with research directions and open problems.
Constraining the Size Growth of the Task Space with Socially Guided Intrinsic Motivation using Demonstrations
This paper presents an algorithm for learning a highly redundant inverse
model in continuous and non-preset environments. Our Socially Guided Intrinsic
Motivation by Demonstrations (SGIM-D) algorithm combines the advantages of both
social learning and intrinsic motivation, to specialise in a wide range of
skills, while lessening its dependence on the teacher. SGIM-D is evaluated on a
fishing skill learning experiment.
Comment: IJCAI Workshop on Agents Learning Interactively from Human Teachers
(ALIHT), Barcelona, Spain (2011)
Automated user modeling for personalized digital libraries
Digital libraries (DL) have become one of the most common ways of accessing any kind of digitized information. Because of this key role, users welcome any improvement in the services they receive from digital libraries. One way to
improve these services is through personalization. Until now, the most common approach to personalization in digital libraries has been user-driven. Nevertheless, the design of efficient personalized services has to be done, at least in part, in
an automatic way. In this context, machine learning techniques can automate the process of constructing user models. This paper proposes a new approach to building digital libraries that satisfy the user's need for information: Adaptive Digital Libraries, libraries that automatically learn user preferences and goals and personalize their interaction using this information.
Anytime Point-Based Approximations for Large POMDPs
The Partially Observable Markov Decision Process has long been recognized as
a rich framework for real-world planning and control problems, especially in
robotics. However, exact solutions in this framework are typically
computationally intractable for all but the smallest problems. A well-known
technique for speeding up POMDP solving involves performing value backups at
specific belief points, rather than over the entire belief simplex. The
efficiency of this approach, however, depends greatly on the selection of
points. This paper presents a set of novel techniques for selecting informative
belief points which work well in practice. The point selection procedure is
combined with point-based value backups to form an effective anytime POMDP
algorithm called Point-Based Value Iteration (PBVI). The first aim of this
paper is to introduce this algorithm and present a theoretical analysis
justifying the choice of belief selection technique. The second aim of this
paper is to provide a thorough empirical comparison between PBVI and other
state-of-the-art POMDP methods, in particular the Perseus algorithm, in an
effort to highlight their similarities and differences. Evaluation is performed
using both standard POMDP domains and realistic robotic tasks.
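The core step the abstract refers to, a value backup at a single belief point rather than over the whole belief simplex, can be sketched as follows. This is a minimal illustrative implementation, not the paper's code: the function name, the array-based POMDP encoding, and the toy model in the usage note are all assumptions made for the example.

```python
import numpy as np

def point_based_backup(b, alphas, T, O, R, gamma):
    """One point-based Bellman backup at a single belief point b (a sketch).

    b      : belief over states, shape (S,)
    alphas : list of alpha-vectors (current value function), each shape (S,)
    T[a]   : transition matrix, T[a][s, s2] = P(s2 | s, a), shape (S, S)
    O[a]   : observation matrix, O[a][s2, o] = P(o | s2, a), shape (S, n_obs)
    R[a]   : immediate reward vector R(s, a), shape (S,)
    Returns the single best new alpha-vector for belief b.
    """
    best_val, best_alpha = -np.inf, None
    for a in range(len(T)):
        new_alpha = R[a].astype(float).copy()
        for o in range(O[a].shape[1]):
            # Back-project each alpha-vector through (action a, observation o):
            # g_i(s) = gamma * sum_s2 P(s2|s,a) P(o|s2,a) alpha_i(s2)
            g = [gamma * T[a] @ (O[a][:, o] * al) for al in alphas]
            # Keep only the projection that maximises value at this belief.
            new_alpha += g[int(np.argmax([b @ gi for gi in g]))]
        val = b @ new_alpha
        if val > best_val:
            best_val, best_alpha = val, new_alpha
    return best_alpha

# Usage on a hypothetical 2-state, 2-action, 2-observation POMDP:
T = [np.array([[0.9, 0.1], [0.1, 0.9]]), np.array([[0.5, 0.5], [0.5, 0.5]])]
O = [np.array([[0.8, 0.2], [0.2, 0.8]]), np.array([[0.8, 0.2], [0.2, 0.8]])]
R = [np.array([1.0, 0.0]), np.array([0.0, 1.0])]
alpha = point_based_backup(np.array([0.5, 0.5]), [np.zeros(2)], T, O, R, 0.95)
```

PBVI repeats this backup only at a finite set of sampled belief points, which is what makes the anytime approximation tractable compared with exact value iteration over the full simplex.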
- …