Evolutionary Algorithms for Reinforcement Learning

Author: Grefenstette J. J.
Moriarty D. E.
Schultz A. C.
Publication venue: 'AI Access Foundation'
Publication date: 01/06/2011

There are two distinct approaches to solving reinforcement learning problems, namely, searching in value function space and searching in policy space. Temporal difference methods and evolutionary algorithms are well-known examples of these approaches. Kaelbling, Littman and Moore recently provided an informative survey of temporal difference methods. This article focuses on the application of evolutionary algorithms to the reinforcement learning problem, emphasizing alternative policy representations, credit assignment methods, and problem-specific genetic operators. Strengths and weaknesses of the evolutionary approach to reinforcement learning are presented, along with a survey of representative applications.
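To make "searching in policy space" concrete, the following is a minimal sketch of an evolutionary search over tabular policies. It is not taken from the article: the corridor task, the reward values, and the names `episode_return`, `mutate`, and `evolve` are all assumptions chosen for illustration.

```python
import random

N_STATES = 6      # hypothetical corridor with cells 0..5; the goal is the last cell
MAX_STEPS = 12    # hypothetical episode length limit


def episode_return(policy):
    """Run one episode. policy[s] is 0 (move left) or 1 (move right)."""
    state, total = 0, 0.0
    for _ in range(MAX_STEPS):
        state = max(0, state + (1 if policy[state] else -1))
        total -= 1.0                  # assumed cost per step
        if state == N_STATES - 1:
            total += 10.0             # assumed reward for reaching the goal
            break
    return total


def mutate(policy, rng, rate=0.2):
    """Return a copy of the policy with each action flipped with probability `rate`."""
    return [1 - a if rng.random() < rate else a for a in policy]


def evolve(generations=30, pop_size=20, seed=0):
    """Truncation selection with elitism: keep the better half, refill with mutants."""
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(N_STATES)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=episode_return, reverse=True)
        parents = population[: pop_size // 2]
        children = [mutate(rng.choice(parents), rng)
                    for _ in range(pop_size - len(parents))]
        population = parents + children
    return max(population, key=episode_return)


if __name__ == "__main__":
    best = evolve()
    print(best, episode_return(best))
```

The whole episode return serves as the fitness of a policy, so no per-state value estimates are kept; this is the contrast with temporal difference methods that the abstract draws.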

arXiv.org e-Print Archive

Crossref

Reactive with tags classifier system applied to real robot navigation

Author: Isasi Pedro
Molina López José Manuel
Sanchis de Miguel María Araceli
Segovia Javier
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/1999

7th IEEE International Conference on Emerging Technologies and Factory Automation. Barcelona, 18-21 October 1999. A reactive with tags classifier system (RTCS) is a special classifier system that combines the execution capabilities of symbolic systems with the learning capabilities of genetic algorithms. An RTCS can learn symbolic rules that generate sequences of actions by chaining rules across different time instants, and that react to new environmental situations by taking the most recent environmental situation into account when making a decision. The capacity of the RTCS to learn good rules has been demonstrated on a robot navigation problem. Results show the suitability of this approach to the navigation problem and the coherence of the extracted rules.

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Universidad Carlos III de Madrid e-Archivo

An enhanced classifier system for autonomous robot navigation in dynamic environments

Author: Berlanga de Jesús Antonio
Isasi Pedro
Molina López José Manuel
Sanchis de Miguel María Araceli
Publication venue: TSI Press, San Antonio, Texas, USA.
Publication date: 01/01/2000

In many cases, a real robot application requires navigation in dynamic environments. The navigation problem involves two main tasks: avoiding obstacles and reaching a goal. Generally, this problem can be addressed by considering reactions and sequences of actions. Solving the navigation problem requires a complete controller that includes both actions and reactions. Machine learning techniques have been applied to learn these controllers. Classifier Systems (CS) have proven their ability to learn continuously in these domains. However, CS have some problems in reactive systems. In this paper, a modified CS is proposed to overcome these problems. Two special mechanisms are included in the developed CS to allow the learning of both reactions and sequences of actions. The learning process has been divided into two main tasks: first, discriminating between a predefined set of rules and, second, discovering new rules to operate successfully in dynamic environments. Different experiments have been carried out using a Khepera mini-robot to find a generalised solution. The results show the system's ability to learn continuously and adapt to new situations.

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Universidad Carlos III de Madrid e-Archivo

Distributed ARTMAP

Author: Carpenter Gail A.
Milenova Boriana L.
Publication venue: Boston University Center for Adaptive Systems and Department of Cognitive and Neural Systems
Publication date: 01/05/1999

Distributed coding at the hidden layer of a multi-layer perceptron (MLP) endows the network with memory compression and noise tolerance capabilities. However, an MLP typically requires slow off-line learning to avoid catastrophic forgetting in an open input environment. An adaptive resonance theory (ART) model is designed to guarantee stable memories even with fast on-line learning. However, ART stability typically requires winner-take-all coding, which may cause category proliferation in a noisy input environment. Distributed ARTMAP (dARTMAP) seeks to combine the computational advantages of MLP and ART systems in a real-time neural network for supervised learning. This system incorporates elements of the unsupervised dART model as well as new features, including a content-addressable memory (CAM) rule. Simulations show that dARTMAP retains fuzzy ARTMAP accuracy while significantly improving memory compression. The model's computational learning rules correspond to paradoxical cortical data. Office of Naval Research (N00014-95-1-0409, N00014-95-1-0657

Boston University Institutional Repository (OpenBU)

Credit Assignment in Adaptive Evolutionary Algorithms

Author: Pham Dr Tuan Q
Sarker Dr Ruhul A
Whitacre Dr James M
Publication venue
Publication date: 08/07/2006

In this paper, a new method for assigning credit to search operators is presented. Starting with the principle of optimizing search bias, search operators are selected based on an ability to create solutions that are historically linked to future generations. Using a novel framework for defining performance measurements, distributing credit for performance, and the statistical interpretation of this credit, a new adaptive method is developed and shown to outperform a variety of adaptive and non-adaptive competitors.
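As a rough illustration of how credit can steer operator selection, the sketch below uses plain probability matching, a generic adaptive scheme and not the method this paper proposes. The operator names, the reward value, and the parameters `alpha` and `p_min` are assumptions; in the paper's setting the reward would come from a measure of how strongly an operator's offspring are linked to later generations.

```python
def update_quality(quality, op, reward, alpha=0.3):
    """Return a new quality table with operator `op` moved toward `reward`.

    Uses an exponential recency-weighted average with step size `alpha`.
    """
    updated = dict(quality)
    updated[op] += alpha * (reward - updated[op])
    return updated


def selection_probabilities(quality, p_min=0.05):
    """Probability matching over non-negative quality estimates.

    Each operator keeps a floor of `p_min`; the remaining probability mass
    is shared in proportion to quality.
    """
    k = len(quality)
    total = sum(quality.values())
    if total <= 0:
        return {op: 1.0 / k for op in quality}
    return {op: p_min + (1 - k * p_min) * q / total
            for op, q in quality.items()}


if __name__ == "__main__":
    # Hypothetical operators and reward, for illustration only.
    quality = {"mutation": 1.0, "crossover": 1.0}
    quality = update_quality(quality, "mutation", reward=2.0)
    print(selection_probabilities(quality))
```

The floor `p_min` keeps every operator in play, so an operator that earns little credit early can still be sampled and re-evaluated later.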

arXiv.org e-Print Archive

A reactive approach to classifier systems

Author: Isasi Pedro
Molina López José Manuel
Sanchis de Miguel María Araceli
Sevilla Carlos
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/1998

IEEE International Conference on Systems, Man, and Cybernetics. San Diego, CA, 11-14 Oct. 1998. The navigation problem involves reaching a goal while avoiding obstacles in dynamic environments. This problem can be addressed by considering reactions and/or sequences of actions. Classifier Systems (CS) have proven their ability to learn continuously; however, they have some problems in reactive systems. A modified CS is proposed to overcome these problems. Two special mechanisms are included in the developed CS to allow the learning of both reactions and sequences of actions. This learning process involves two main tasks: first, discriminating between rules and, second, discovering new rules to operate successfully in dynamic environments. Different experiments have been carried out using a Khepera mini-robot to find a generalized solution. The results show the system's ability to learn continuously and adapt to new situations.

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Universidad Carlos III de Madrid e-Archivo

Applying classifier systems to learn the reactions in mobile robots

Author: Isasi Pedro
Molina López José Manuel
Sanchis de Miguel María Araceli
Segovia Javier
Publication venue: 'Informa UK Limited'
Publication date: 01/01/2001

The navigation problem involves reaching a goal while avoiding obstacles in dynamic environments. This problem can be addressed by considering reactions and sequences of actions. Classifier systems (CSs) have proven their ability to learn continuously; however, they have some problems in reactive systems. A modified CS, namely a reactive classifier system (RCS), is proposed to overcome those problems. Two special mechanisms are included in the RCS: the absence of internal cycles inside the CS (no internal cycles) and the fusion of the environmental message with the messages posted to the message list in the previous instant (generation list through fusion). These mechanisms allow the learning of both reactions and sequences of actions. This learning process involves two main tasks: first, discriminating between rules and, second, discovering new rules to operate successfully in dynamic environments. Different experiments have been carried out using a Khepera mini-robot to find a generalized solution. The results show the system's ability to learn continuously and adapt to new situations.

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Universidad Carlos III de Madrid e-Archivo

Learning sequences of rules using classifier systems with tags

Author: Isasi Pedro
Molina López José Manuel
Sanchis de Miguel María Araceli
Segovia Javier
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/10/1999

IEEE International Conference on Systems, Man, and Cybernetics. Tokyo, 12-15 October 1999. The objective of this paper was to obtain an encoding structure that would allow the genetic evolution of rules in such a manner that the number of rules and the relationships among them in a classifier system (CS) would be learnt during the evolution process. For this purpose, an area that allows the definition of rule groups has been introduced into the condition and message parts of the encoded rules. This area is called the internal tag. The term was coined because the system has some similarities with natural processes that take place in certain animal species, where the existence of tags allows them to communicate with and recognize each other. Such a CS is called a tag classifier system (TCS). The TCS has been tested in the game of draughts and compared with the classical CS. The results show an improvement in CS performance.

Universidad Carlos III de Madrid e-Archivo

Sustainable Cooperative Coevolution with a Multi-Armed Bandit

Author: De Rainville François-Michel
Gagné Christian
Laurendeau Denis
Schoenauer Marc
Sebag Michèle
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2013

This paper proposes a self-adaptation mechanism to manage the resources allocated to the different species comprising a cooperative coevolutionary algorithm. The proposed approach relies on a dynamic extension to the well-known multi-armed bandit framework. At each iteration, the dynamic multi-armed bandit decides which species to evolve for a generation, using the history of progress made by the different species to guide the decision. We show experimentally, on a benchmark and a real-world problem, that evolving the different populations at different paces not only identifies solutions more rapidly but also improves the capacity of cooperative coevolution to solve more complex problems. Comment: Accepted at GECCO 201
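The species-selection step described above can be sketched with the standard UCB1 bandit rule. This is the plain, stationary rule and not the dynamic extension the paper proposes; the function names and the idea of feeding in a scalar `progress` per generation are assumptions made for illustration.

```python
import math


def pick_species(counts, mean_progress, c=math.sqrt(2)):
    """UCB1: return the index of the species to evolve for the next generation.

    counts[i] is how often species i has been evolved so far and
    mean_progress[i] is its average observed progress.
    """
    for i, n in enumerate(counts):
        if n == 0:
            return i                  # evolve every species once before comparing
    total = sum(counts)
    scores = [m + c * math.sqrt(math.log(total) / n)
              for m, n in zip(mean_progress, counts)]
    return max(range(len(counts)), key=scores.__getitem__)


def record_progress(counts, mean_progress, i, progress):
    """Update the running mean of progress for species i in place."""
    counts[i] += 1
    mean_progress[i] += (progress - mean_progress[i]) / counts[i]


if __name__ == "__main__":
    counts, means = [0, 0, 0], [0.0, 0.0, 0.0]
    # Hypothetical fixed progress values stand in for real fitness gains.
    fake_progress = [0.1, 0.9, 0.2]
    for _ in range(20):
        s = pick_species(counts, means)
        record_progress(counts, means, s, fake_progress[s])
    print(counts)
```

The exploration bonus shrinks as a species is evolved more often, so species that keep making progress receive most of the generations while the others are still revisited occasionally.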

arXiv.org e-Print Archive

HAL-CentraleSupelec

Crossref

INRIA a CCSD electronic archive server

HAL-Rennes 1