Metric State Space Reinforcement Learning for a Vision-Capable Mobile Robot
We address the problem of autonomously learning controllers for
vision-capable mobile robots. We extend McCallum's (1995) Nearest-Sequence
Memory algorithm to allow for general metrics over state-action trajectories.
We demonstrate the feasibility of our approach by successfully running our
algorithm on a real mobile robot. The algorithm is novel and unique in that it
(a) explores the environment and learns directly on a mobile robot without
using a hand-made computer model as an intermediate step, (b) does not require
manual discretization of the sensor input space, (c) works in piecewise
continuous perceptual spaces, and (d) copes with partial observability.
Together this allows learning from much less experience compared to previous
methods.
Comment: 14 pages, 8 figures
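The "general metrics over state-action trajectories" idea can be pictured with a toy sketch. The suffix metric, per-step metric, and data below are illustrative assumptions, not McCallum's exact formulation:

```python
# Toy illustration: compare the recent state-action history against
# stored trajectories using a pluggable metric over trajectory suffixes.
# All names, the metric, and the data are illustrative assumptions.
def suffix_distance(hist_a, hist_b, metric, horizon=3):
    """Sum a per-step metric over the aligned last `horizon` steps."""
    a, b = hist_a[-horizon:], hist_b[-horizon:]
    penalty = abs(len(a) - len(b)) * 1.0   # missing steps cost a fixed amount
    return penalty + sum(metric(x, y) for x, y in zip(reversed(a), reversed(b)))

def step_metric(sa1, sa2):
    """Distance on (state, action) pairs: absolute difference on states,
    a fixed mismatch cost on actions."""
    (s1, a1), (s2, a2) = sa1, sa2
    return abs(s1 - s2) + (0.0 if a1 == a2 else 0.5)

current = [(0.1, "fwd"), (0.9, "left"), (1.0, "fwd")]
stored = [
    [(0.0, "fwd"), (1.0, "left"), (1.1, "fwd")],    # similar experience
    [(5.0, "right"), (4.0, "right"), (3.0, "fwd")],  # dissimilar experience
]
nearest = min(stored, key=lambda h: suffix_distance(current, h, step_metric))
print(nearest is stored[0])  # the similar trajectory is nearest
```

A nearest-sequence method would then reuse the value estimates attached to the closest stored trajectories.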
Reinforcement Learning: A Survey
This paper surveys the field of reinforcement learning from a
computer-science perspective. It is written to be accessible to researchers
familiar with machine learning. Both the historical basis of the field and a
broad selection of current work are summarized. Reinforcement learning is the
problem faced by an agent that learns behavior through trial-and-error
interactions with a dynamic environment. The work described here has a
resemblance to work in psychology, but differs considerably in the details and
in the use of the word ``reinforcement.'' The paper discusses central issues of
reinforcement learning, including trading off exploration and exploitation,
establishing the foundations of the field via Markov decision theory, learning
from delayed reinforcement, constructing empirical models to accelerate
learning, making use of generalization and hierarchy, and coping with hidden
state. It concludes with a survey of some implemented systems and an assessment
of the practical utility of current methods for reinforcement learning.
Comment: See http://www.jair.org/ for any accompanying files
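The central issues named above, trading off exploration and exploitation and learning from delayed reinforcement, show up even in the smallest settings. Below is a minimal tabular Q-learning sketch on a hypothetical five-state chain, not an example taken from the survey itself:

```python
import random

# Hypothetical toy problem: a five-state chain with reward only at the
# right end. Actions: 0 = step left, 1 = step right.
N_STATES = 5
ACTIONS = (0, 1)
ALPHA, GAMMA, EPSILON = 0.5, 0.9, 0.2

def step(state, action):
    """Deterministic chain dynamics; reward 1.0 on reaching the last state."""
    nxt = max(0, min(N_STATES - 1, state + (1 if action == 1 else -1)))
    return nxt, (1.0 if nxt == N_STATES - 1 else 0.0), nxt == N_STATES - 1

Q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}

def greedy_action(s):
    best = max(Q[(s, a)] for a in ACTIONS)
    return random.choice([a for a in ACTIONS if Q[(s, a)] == best])

random.seed(0)
for _ in range(500):                       # training episodes
    s, done, steps = 0, False, 0
    while not done and steps < 200:
        # Exploration vs. exploitation: random action with prob. EPSILON.
        a = random.choice(ACTIONS) if random.random() < EPSILON else greedy_action(s)
        s2, r, done = step(s, a)
        # One-step temporal-difference update handles delayed reinforcement:
        # reward at the goal propagates backward through the value table.
        target = r + GAMMA * max(Q[(s2, b)] for b in ACTIONS)
        Q[(s, a)] += ALPHA * (target - Q[(s, a)])
        s, steps = s2, steps + 1

policy = [greedy_action(s) for s in range(N_STATES - 1)]
print(policy)  # the learned greedy policy should step right toward the reward
```

Even here the dilemma is visible: with EPSILON = 0, a tie-broken-left agent never discovers the distant reward.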
An Architecture for Multilevel Learning and Robotic Control based on Concept Generation
Robot and multi-robot systems are inherently complex systems, for which designing the programs to control their behaviours proves complicated. Moreover, control programs that have been successfully designed for a particular environment and task can become useless if either of these changes. It is for this reason that this thesis investigates the use of machine learning within robot and multi-robot systems. It explores an architecture for machine learning, applied to autonomous mobile robots, based on dividing the learning task into two individual but interleaved sub-tasks.
The first sub-task consists of finding an appropriate representation on which to base behaviour learning. The thesis explores the viability of using multidimensional classification techniques to generalise the original sensor and motor representations into abstract hierarchies of 'concepts'. To construct concepts the research used standard classification techniques, and experimented with a novel method of multidimensional data classification based on 'Q-analysis'. Results suggest that this may be a powerful new approach to concept learning.
The second sub-task consists of using the previously acquired concepts as the representation for behaviour learning. The thesis explores whether it is possible to learn robotic behaviours represented using concepts. Results show that it is possible to learn low-level behaviours such as navigation and higher-level ones such as ball passing in robot football.
The thesis concludes that the proposed architecture is viable for robotic behaviour learning and control, and that incorporating Q-analysis-based classification results in a promising new approach to the control of robot and multi-robot systems.
Closed-Loop Learning of Visual Control Policies
In this paper we present a general, flexible framework for learning mappings
from images to actions by interacting with the environment. The basic idea is
to introduce a feature-based image classifier in front of a reinforcement
learning algorithm. The classifier partitions the visual space according to the
presence or absence of a few highly informative local descriptors that are
incrementally selected in a sequence of attempts to remove perceptual aliasing.
We also address the problem of fighting overfitting in such a greedy algorithm.
Finally, we show how high-level visual features can be generated when the power
of local descriptors is insufficient for completely disambiguating the aliased
states. This is done by building a hierarchy of composite features that consist
of recursive spatial combinations of visual features. We demonstrate the
efficacy of our algorithms by solving three visual navigation tasks and a
visual version of the classical Car on the Hill control problem.
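The incremental selection of informative descriptors can be pictured with a toy greedy sketch; the samples, descriptor names, and scoring below are illustrative assumptions rather than the paper's actual method:

```python
from collections import Counter

# Toy greedy feature selection: pick, one descriptor at a time, the binary
# local descriptor whose presence/absence best separates observations that
# require different actions (i.e. removes perceptual aliasing).
# All data and names here are made up for illustration.
def aliasing(groups):
    """Count action disagreements within each perceptual cell."""
    total = 0
    for acts in groups.values():
        counts = Counter(acts)
        total += len(acts) - max(counts.values())
    return total

def greedy_select(samples, descriptors, n_features):
    """samples: list of (set_of_present_descriptors, required_action)."""
    chosen = []
    for _ in range(n_features):
        def score(d):
            groups = {}
            for feats, act in samples:
                key = tuple(f in feats for f in chosen + [d])
                groups.setdefault(key, []).append(act)
            return aliasing(groups)
        best = min((d for d in descriptors if d not in chosen), key=score)
        chosen.append(best)
    return chosen

samples = [({"door"}, "enter"), ({"door", "wall"}, "enter"),
           ({"wall"}, "turn"), (set(), "turn")]
chosen = greedy_select(samples, ["door", "wall", "floor"], 1)
print(chosen)  # "door" alone separates the two required actions
```

When no single descriptor removes the aliasing, the paper's composite-feature hierarchy would come into play; that part is not sketched here.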
A Graph-Based Reinforcement Learning Method with Converged State Exploration and Exploitation
In any classical value-based reinforcement learning method, an agent, despite its continuous interactions with the environment, is unable to quickly build a complete and independent description of the entire environment, leaving the learning method to struggle with the difficult dilemma of choosing between two tasks: exploration and exploitation. The problem becomes more pronounced when the agent must deal with a dynamic environment whose configuration and/or parameters are constantly changing. In this paper, the problem is approached by first mapping the reinforcement learning scheme to a directed graph, within which the set of all states already explored continues to be exploited. We prove that the two tasks of exploration and exploitation eventually converge in the decision-making process, so there is no need to face the exploration vs. exploitation tradeoff as all existing reinforcement learning methods do. Rather, this observation indicates that a reinforcement learning scheme is essentially a search for the shortest path in a dynamic environment, which is readily tackled by a modified Floyd-Warshall algorithm proposed in the paper. Experimental results confirm that the proposed graph-based reinforcement learning algorithm significantly outperforms both the standard Q-learning algorithm and an improved Q-learning algorithm in solving mazes, rendering it an algorithm of choice in applications involving dynamic environments.
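The shortest-path view maps naturally onto the classical Floyd-Warshall algorithm. The sketch below shows the standard all-pairs version on a tiny made-up state graph; the paper's modification for dynamic environments is not reproduced here:

```python
# Standard Floyd-Warshall all-pairs shortest paths on a small directed
# graph. This is the textbook algorithm, not the paper's modified
# dynamic-environment variant; the example graph is made up.
INF = float("inf")

def floyd_warshall(n, edges):
    """edges: iterable of (u, v, w) directed weighted edges over n vertices."""
    dist = [[0 if i == j else INF for j in range(n)] for i in range(n)]
    for u, v, w in edges:
        dist[u][v] = min(dist[u][v], w)
    for k in range(n):            # allow intermediate vertices 0..k
        for i in range(n):
            for j in range(n):
                if dist[i][k] + dist[k][j] < dist[i][j]:
                    dist[i][j] = dist[i][k] + dist[k][j]
    return dist

# Tiny maze-like state graph: two routes from state 0 to the goal state 3.
d = floyd_warshall(4, [(0, 1, 1), (1, 3, 5), (0, 2, 2), (2, 3, 1)])
print(d[0][3])  # shortest path 0 -> 2 -> 3 has cost 3
```

In a dynamic environment, edge weights change over time, which is where the paper's modified variant would amortise recomputation.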
Growing Action Spaces
In complex tasks, such as those with large combinatorial action spaces,
random exploration may be too inefficient to achieve meaningful learning
progress. In this work, we use a curriculum of progressively growing action
spaces to accelerate learning. We assume the environment is out of our control,
but that the agent may set an internal curriculum by initially restricting its
action space. Our approach uses off-policy reinforcement learning to estimate
optimal value functions for multiple action spaces simultaneously and
efficiently transfers data, value estimates, and state representations from
restricted action spaces to the full task. We show the efficacy of our approach
in proof-of-concept control tasks and on challenging large-scale StarCraft
micromanagement tasks with large, multi-agent action spaces.
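The notion of an internally restricted, progressively growing action space can be sketched as a simple curriculum wrapper. The class, stage schedule, and action names below are illustrative assumptions, not the paper's transfer machinery:

```python
# Illustrative sketch of an action-space curriculum: the agent starts with
# a restricted action set and unlocks the full set as training progresses.
# Stage thresholds and action names are made up for this example.
class ActionCurriculum:
    def __init__(self, full_actions, stages):
        # stages: list of (min_episode, number_of_actions_unlocked)
        self.full_actions = full_actions
        self.stages = sorted(stages)

    def available(self, episode):
        """Return the action subset unlocked at the given episode."""
        n = 1
        for min_ep, k in self.stages:
            if episode >= min_ep:
                n = k
        return self.full_actions[:n]

cur = ActionCurriculum(["noop", "move", "attack", "focus_fire"],
                       stages=[(0, 2), (100, 3), (300, 4)])
print(cur.available(0))    # ['noop', 'move']
print(cur.available(150))  # ['noop', 'move', 'attack']
print(cur.available(500))  # full action set
```

The paper's contribution lies in transferring data, value estimates, and state representations across these restricted spaces, which this wrapper does not attempt.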
VQQL. Applying vector quantization to reinforcement learning
Proceedings of: RoboCup-99: Robot Soccer World Cup III, July 27 to August 6, 1999, Stockholm, Sweden
Reinforcement learning has proven to be a set of successful techniques for finding optimal policies in uncertain and/or dynamic domains, such as the RoboCup. One of the problems of using such techniques appears with large state and action spaces, as is the case for the input information coming from the Robosoccer simulator. In this paper, we describe a new mechanism for solving the state generalization problem in reinforcement learning algorithms. This clustering mechanism is based on the vector quantization technique for signal analog-to-digital conversion and compression, and on the Generalized Lloyd Algorithm for the design of vector quantizers. Furthermore, we present the VQQL model, which integrates Q-Learning as the reinforcement learning technique and vector quantization as the state generalization technique. We show some results of applying this model to learning the interception skill for Robosoccer agents.
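The VQQL pipeline, as described, first designs a codebook with the Generalized Lloyd Algorithm and then treats codeword indices as discrete states for Q-Learning. Below is a minimal 1-D sketch of the quantization half, with made-up data and sizes:

```python
import random

# Minimal sketch of the vector-quantization half of VQQL: learn a small
# codebook over raw continuous observations with the Generalized Lloyd
# algorithm, then map each observation to a discrete codeword index.
# Data, dimensionality (1-D), and sizes are made up for illustration.
def lloyd(samples, k, iters=20):
    """Generalized Lloyd algorithm: k codewords for 1-D samples."""
    codebook = random.sample(samples, k)
    for _ in range(iters):
        cells = [[] for _ in range(k)]
        for x in samples:                    # nearest-codeword assignment
            i = min(range(k), key=lambda j: abs(x - codebook[j]))
            cells[i].append(x)
        for i, cell in enumerate(cells):     # centroid update
            if cell:
                codebook[i] = sum(cell) / len(cell)
    return codebook

def quantize(x, codebook):
    """Map a continuous observation to a discrete state index."""
    return min(range(len(codebook)), key=lambda i: abs(x - codebook[i]))

random.seed(1)
samples = [random.gauss(0, 1) for _ in range(200)] + \
          [random.gauss(10, 1) for _ in range(200)]
cb = lloyd(samples, 2)
print(sorted(round(c, 1) for c in cb))  # codewords near the two cluster means
```

Tabular Q-Learning can then be run directly over the discrete indices returned by quantize, which is the integration the VQQL model describes.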