    Improving the Performance of Complex Agent Plans Through Reinforcement Learning

    Agent programming in complex, partially observable and stochastic domains usually requires a great deal of understanding of both the domain and the task, in order to provide the agent with the knowledge necessary to act effectively. While symbolic methods allow the designer to specify declarative knowledge about the domain, the resulting plan can be brittle, since it is difficult to supply a symbolic model accurate enough to foresee all possible events in complex environments, especially under partial observability. Reinforcement Learning (RL) techniques, on the other hand, can learn a policy and make use of a learned model, but it is difficult to reduce and shape the scope of the learning algorithm by exploiting a priori information. We propose a methodology for writing complex agent programs that can be effectively improved through experience. We show how to derive a stochastic process from a partial specification of the plan, so that the latter's performance can be improved by solving an RL problem much smaller than classical RL formulations. Finally, we demonstrate our approach in the context of Keepaway Soccer, a common RL benchmark based on the RoboCup Soccer 2D simulator. Copyright © 2010, International Foundation for Autonomous Agents and Multiagent Systems (www.ifaamas.org). All rights reserved.
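    The core idea of exploiting a partial plan specification to shrink the learning problem can be illustrated with a small, hedged sketch: a hand-written plan leaves a few choice points open, and a tabular learner is trained only over those points rather than over the full state-action space. The names below (choice points, observations, admissible actions) are illustrative assumptions, not the paper's formulation or API.

```python
# Hedged sketch (illustrative only): tabular Q-learning restricted to the
# choice points of a hand-written plan, so the RL problem stays small.
# Choice-point identifiers, observations and admissible actions are assumed
# to be supplied by the plan interpreter.
import random
from collections import defaultdict

ALPHA, GAMMA, EPSILON = 0.1, 0.95, 0.1

# Q-values indexed by (choice_point_id, observation); the inner dict maps
# an admissible action to its current value estimate.
Q = defaultdict(lambda: defaultdict(float))

def select_action(choice_point, obs, actions):
    """Epsilon-greedy choice among the actions the plan allows here."""
    if random.random() < EPSILON:
        return random.choice(actions)
    values = Q[(choice_point, obs)]
    return max(actions, key=lambda a: values[a])

def update(choice_point, obs, action, reward, next_key, next_actions):
    """Q-learning backup applied only when a choice point is reached."""
    best_next = max((Q[next_key][a] for a in next_actions), default=0.0)
    td_error = reward + GAMMA * best_next - Q[(choice_point, obs)][action]
    Q[(choice_point, obs)][action] += ALPHA * td_error
```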

    Robot pain: a speculative review of its functions

    Given the scarce bibliography dealing explicitly with robot pain, this chapter enriches its review with related research on robot behaviours and capacities in which pain could play a role. It is shown that all such roles, ranging from punishment to intrinsic motivation and planning knowledge, can be formulated within the unified framework of reinforcement learning.
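    As a purely illustrative sketch of one of the roles discussed (pain as punishment), a pain signal can be folded into the reward an RL agent optimises. The interface below (a task reward plus a damage flag) is an assumption for illustration, not an interface from the chapter.

```python
# Illustrative only: "pain as punishment" cast as a negative reward
# component added whenever a damage signal fires. The task_reward and
# damage_detected inputs are assumptions, not taken from the chapter.
def shaped_reward(task_reward: float, damage_detected: bool,
                  pain_weight: float = 10.0) -> float:
    """Combine the task reward with a pain penalty triggered by damage."""
    pain = -pain_weight if damage_detected else 0.0
    return task_reward + pain
```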

    TreeQN and ATreeC: Differentiable Tree-Structured Models for Deep Reinforcement Learning

    Combining deep model-free reinforcement learning with on-line planning is a promising approach to building on the successes of deep RL. On-line planning with look-ahead trees has proven successful in environments where transition models are known a priori. However, in complex environments where transition models need to be learned from data, the deficiencies of learned models have limited their utility for planning. To address these challenges, we propose TreeQN, a differentiable, recursive, tree-structured model that serves as a drop-in replacement for any value function network in deep RL with discrete actions. TreeQN dynamically constructs a tree by recursively applying a transition model in a learned abstract state space and then aggregating predicted rewards and state-values using a tree backup to estimate Q-values. We also propose ATreeC, an actor-critic variant that augments TreeQN with a softmax layer to form a stochastic policy network. Both approaches are trained end-to-end, such that the learned model is optimised for its actual use in the tree. We show that TreeQN and ATreeC outperform n-step DQN and A2C on a box-pushing task, as well as n-step DQN and value prediction networks (Oh et al. 2017) on multiple Atari games. Furthermore, we present ablation studies that demonstrate the effect of different auxiliary losses on learning transition models.
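    A rough sketch of the tree construction and backup described above, assuming learned functions transition(z, a), reward(z, a) and value(z) over an abstract state z; these names are placeholders for the learned networks, not the authors' code. For simplicity the backup below uses a hard max over children, whereas TreeQN aggregates predicted rewards and values with its own tree-backup scheme, so this is a simplification rather than the paper's implementation.

```python
# Rough sketch of a depth-limited look-ahead in a learned abstract state
# space. transition, reward and value stand in for learned networks; the
# hard max backup is a simplification of TreeQN's aggregation.
GAMMA = 0.99

def tree_q_values(z, actions, depth, transition, reward, value):
    """Return one Q estimate per action via a recursive tree backup."""
    q_values = []
    for a in actions:
        z_next = transition(z, a)      # predicted next abstract state
        r = reward(z, a)               # predicted immediate reward
        if depth == 1:
            backup = value(z_next)     # leaf: bootstrap from the value net
        else:
            child_qs = tree_q_values(z_next, actions, depth - 1,
                                     transition, reward, value)
            backup = max(child_qs)     # greedy backup over child Q-values
        q_values.append(r + GAMMA * backup)
    return q_values
```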