21 research outputs found

    A brief history of learning classifier systems: from CS-1 to XCS and its variants

    The direction set by Wilson's XCS is that modern Learning Classifier Systems can be characterized by their use of rule accuracy as the utility metric for the search algorithm(s) that discover useful rules. Such searching typically takes place within the restricted space of co-active rules for efficiency. This paper gives an overview of the evolution of Learning Classifier Systems up to XCS, and then of some of the subsequent developments of Wilson's algorithm for different types of learning.

    ZCS redux

    Learning classifier systems traditionally use genetic algorithms to facilitate rule discovery, where rule fitness is payoff based. Current research has shifted to the use of accuracy-based fitness. This paper re-examines the use of a particular payoff-based learning classifier system, ZCS. Using simple difference equation models of ZCS, we show that this system is capable of optimal performance subject to appropriate parameter settings. This is demonstrated for both single-step and multi-step tasks. Optimal performance of ZCS in well-known multi-step maze tasks is then presented to support the findings from the models.
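
    The difference-equation view can be illustrated with a short sketch. The Python fragment below is illustrative only, not the paper's actual model: it iterates the ZCS-style shared-payoff strength update for a single-step task (beta follows Wilson's original ZCS default; the reward and action-set size are arbitrary) and shows convergence to the fixed point payoff/|A|.

        # A minimal sketch of the ZCS action-set strength update treated as a
        # difference equation. beta is the learning rate; for a single-step
        # task the payoff P is just the external reward, shared across the
        # |A| co-active rules in the action set.
        beta = 0.2            # learning rate (Wilson's ZCS default)
        reward = 1000.0       # external payoff on a correct action
        n_active = 4          # size of the action set |A|

        s = 0.0               # shared rule strength
        for t in range(100):
            s += beta * (reward / n_active - s)   # s(t+1) = s(t) + beta*(P/|A| - s(t))

        print(round(s, 1))    # ~250.0, the fixed point P/|A|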

    Learning Mazes with Aliasing States: An LCS Algorithm with Associative Perception

    Learning classifier systems (LCSs) belong to a class of algorithms based on the principle of self-organization and have frequently been applied to the task of solving mazes, an important type of reinforcement learning (RL) problem. Maze problems represent a simplified virtual model of real environments that can be used for developing core algorithms of many real-world applications related to the problem of navigation. However, the best achievements of LCSs in maze problems are still mostly confined to non-aliasing environments, while LCS complexity seems to obstruct a proper analysis of the reasons for failure. We construct a new LCS agent that has a simpler and more transparent performance mechanism, but that can still solve mazes better than existing algorithms. We use the structure of a predictive LCS model, strip out the evolutionary mechanism, simplify the reinforcement learning procedure, and equip the agent with associative perception, a capability adopted from psychology. To improve our understanding of the nature and structure of maze environments, we analyze mazes used in research over the last two decades, introduce a set of maze complexity characteristics, and develop a set of new maze environments. We then run our new LCS with associative perception through the old and new aliasing mazes, which represent partially observable Markov decision problems (POMDPs), and demonstrate that it performs at least as well as, and in some cases better than, other published systems.
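
    To make the notion of an aliasing state concrete, the sketch below (the maze layout and sensory coding are illustrative, not taken from the paper) builds a small grid maze in which an agent sensing only its eight neighbouring cells receives identical percepts in distinct positions; this is exactly what turns the maze into a POMDP.

        # Distinct cells that produce the same local percept are "aliased":
        # the agent cannot tell them apart, so the task is a POMDP.
        MAZE = [
            "#########",
            "#.......#",
            "#.#.#.#.#",
            "#...G...#",
            "#########",
        ]

        def observe(row, col):
            """The agent's percept: its eight neighbouring cells, read
            left-to-right, top-to-bottom."""
            return "".join(MAZE[row + dr][col + dc]
                           for dr in (-1, 0, 1) for dc in (-1, 0, 1)
                           if (dr, dc) != (0, 0))

        # Cells (1,2) and (1,4) are different states with the same percept.
        assert observe(1, 2) == observe(1, 4)
        print(observe(1, 2))   # -> ###...#.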

    Symbiogenesis in learning classifier systems

    Symbiosis is the phenomenon in which organisms of different species live together in close association, resulting in a raised level of fitness for one or more of the organisms. Symbiogenesis is the name given to the process by which symbiotic partners combine and unify, that is, become genetically linked, giving rise to new morphologies and physiologies evolutionarily more advanced than their constituents. The importance of this process in the evolution of complexity is now well established. Learning classifier systems are a machine learning technique that uses both evolutionary computing techniques and reinforcement learning to develop a population of cooperative rules to solve a given task. In this article we examine the use of symbiogenesis within the classifier system rule base to improve performance. Results show that incorporating simple rule linkage does not give any benefits. The concept of (temporal) encapsulation is then added to the symbiotic rules and shown to improve performance in ambiguous/non-Markov environments.
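
    The rule-linkage idea can be sketched as follows (an assumed encoding, not necessarily the paper's): two classifiers fuse into one symbiotic unit, and with temporal encapsulation the second rule's action fires without re-reading the sensors, which is what helps in aliased, non-Markov states.

        from dataclasses import dataclass

        @dataclass
        class Rule:
            condition: str            # ternary string over {0, 1, #}
            action: int

            def matches(self, state: str) -> bool:
                return all(c in ("#", s) for c, s in zip(self.condition, state))

        @dataclass
        class Symbiont:
            """Two genetically linked rules executed as one unit."""
            first: Rule
            second: Rule

            def matches(self, state: str) -> bool:
                return self.first.matches(state)   # only the head perceives

            def actions(self):
                # Temporal encapsulation: the second action is taken blind,
                # bypassing the (possibly aliased) percept on the next step.
                return [self.first.action, self.second.action]

        unit = Symbiont(Rule("01#0", 2), Rule("####", 3))
        if unit.matches("0110"):
            print(unit.actions())     # -> [2, 3]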

    Evolutionary Strategies for Data Mining

    Learning classifier systems (LCS) have been successful in generating rules for solving classification problems in data mining. The rules are of the form IF condition THEN action. The condition encodes the features of the input space and the action encodes the class label. What is lacking in those systems is the ability to express each feature using a function that is appropriate for that feature. The genetic algorithm is in principle capable of this, but cannot do so when only one type of membership function is provided. In that case the genetic algorithm learns only the shape and placement of the membership function and, in some cases, the number of partitions generated by this function. The research conducted in this study employs a learning classifier system to generate the rules for solving classification problems, but also incorporates multiple types of membership functions, allowing the genetic algorithm to choose an appropriate one for each feature of the input space and determine the number of partitions generated by each function. In addition, three new membership functions were introduced. This paper describes the framework and implementation of this modified learning classifier system (M-LCS). Using the M-LCS model, classifiers were simulated for two benchmark classification problems and two additional real-world problems. The results of these four simulations indicate that the M-LCS model provides an alternative approach to designing a learning classifier system. The following contributions are made to the field of computing: 1) a framework for developing a learning classifier system that employs multiple types of membership functions, 2) a model, M-LCS, that was developed from the framework, and 3) the addition of three membership functions that have not previously been used in the design of learning classifier systems.
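
    The central mechanism can be sketched as follows. The three function types and the gene encoding below are illustrative assumptions, not the paper's actual design: each feature's gene selects both a membership-function type and its shape parameters, so the genetic algorithm can evolve the choice of function per feature.

        import math

        # One membership-function catalogue the GA can choose from per feature.
        MF_TYPES = {
            "tri":   lambda x, a, b, c: max(0.0, min((x - a) / (b - a),
                                                     (c - x) / (c - b))),
            "gauss": lambda x, mu, sigma: math.exp(-(x - mu) ** 2 / (2 * sigma ** 2)),
            "trap":  lambda x, a, b, c, d: max(0.0, min((x - a) / (b - a), 1.0,
                                                        (d - x) / (d - c))),
        }

        # One gene per input feature: (function type, shape parameters).
        genome = [("tri", (0.0, 0.5, 1.0)), ("gauss", (0.3, 0.1))]

        def match_degree(x_vec, genome):
            """IF-part of a fuzzy rule: product t-norm over feature degrees."""
            degree = 1.0
            for x, (mf, params) in zip(x_vec, genome):
                degree *= MF_TYPES[mf](x, *params)
            return degree

        print(match_degree([0.5, 0.3], genome))   # -> 1.0 (both memberships peak)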

    XCS Performance and Population Structure in Multi-Step Environments

    SIGLE. Available from British Library Document Supply Centre - DSC:DXN039134 / BLDSC - British Library Document Supply Centre, GB, United Kingdom

    Design and Investigation of a Multi-Agent Based XCS Learning Classifier System with Distributed Rules

    This thesis introduces and investigates a new kind of rule-based evolutionary online learning system. It addresses the problem of distributing the knowledge of a Learning Classifier System, which is represented by a population of classifiers. The result is an XCS-derived Learning Classifier System, ‘XCS with Distributed Rules’ (XCS-DR), that introduces independent, interacting agents to distribute the system’s acquired knowledge evenly. The agents act collaboratively to solve the problem instances at hand. XCS-DR’s design and architecture are explained, and its classification performance is evaluated and scrutinized in detail. While it does not reach the optimal performance of the original XCS, XCS-DR still yields satisfactory classification results, and in the simple case of applying only one agent it performs as accurately as XCS.
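
    The distribution scheme can be illustrated with a short sketch (assumed mechanics, not XCS-DR's actual protocol): the population is dealt evenly across agents, each agent matches the input against only its local rules, and the agents' fitness-weighted votes are combined into one system decision.

        from collections import defaultdict

        def distribute(population, n_agents):
            """Deal rules round-robin so knowledge is spread evenly."""
            agents = [[] for _ in range(n_agents)]
            for i, rule in enumerate(population):
                agents[i % n_agents].append(rule)
            return agents

        def classify(agents, state):
            votes = defaultdict(float)
            for rules in agents:                     # each agent acts locally
                for cond, action, fitness in rules:
                    if all(c in ("#", s) for c, s in zip(cond, state)):
                        votes[action] += fitness     # fitness-weighted voting
            return max(votes, key=votes.get) if votes else None

        population = [("1#0", 1, 0.9), ("10#", 0, 0.4), ("###", 1, 0.2)]
        print(classify(distribute(population, 2), "100"))   # -> 1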

    Improving the Practicality of Model-Based Reinforcement Learning: An Investigation into Scaling up Model-Based Methods in Online Settings

    This thesis is a response to the current scarcity of practical model-based control algorithms in the reinforcement learning (RL) framework. As yet there is no consensus on how best to integrate imperfect transition models into RL while mitigating policy improvement instabilities in online settings. Current state-of-the-art policy learning algorithms that surpass human performance often rely on model-free approaches that enjoy unmitigated sampling of transition data. Model-based RL (MBRL) instead attempts to distil experience into transition models that allow agents to plan new policies without needing to return to the environment and sample more data. The initial focus of this investigation is on kernel conditional mean embeddings (CMEs) (Song et al., 2009) deployed in an approximate policy iteration (API) algorithm (Grünewälder et al., 2012a). This existing MBRL algorithm boasts theoretically stable policy updates in continuous state and discrete action spaces. The Bellman operator’s value function and (transition) conditional expectation are modelled and embedded respectively as functions in a reproducing kernel Hilbert space (RKHS). The resulting finite-induced approximate pseudo-MDP (Yao et al., 2014a) can be solved exactly in a dynamic programming algorithm with policy improvement suboptimality guarantees. However, model construction and policy planning scale cubically and quadratically respectively with the training set size, rendering the CME impractical for sample-abundant tasks in online settings. Three variants of CME API are investigated to strike a balance between stable policy updates and reduced computational complexity. The first variant models the value function and state-action representation explicitly in a parametric CME (PCME) algorithm with favourable computational complexity; a soft conservative policy update technique is developed to mitigate policy learning oscillations in the planning process. The second variant returns to the non-parametric embedding and contributes (along with external work) to the compressed CME (CCME): a sparse and computationally more favourable CME. The final variant is a fully end-to-end differentiable embedding trained with stochastic gradient updates; the value function remains modelled in an RKHS, so that backpropagation is driven by a non-parametric RKHS loss function. The actively compressed CME (ACCME) satisfies the pseudo-MDP contraction constraint using a sparse softmax activation function. The size of the pseudo-MDP (i.e. the size of the embedding’s last layer) is controlled by sparsifying the last-layer weight matrix, extending the truncated gradient method (Langford et al., 2009) with group lasso updates in a novel ‘use it or lose it’ neuron pruning mechanism. Surprisingly, this technique does not require extensive fine-tuning between control tasks.
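
    The ‘use it or lose it’ pruning step can be sketched as a row-wise group-lasso proximal update on the last-layer weight matrix (shapes and thresholds below are illustrative, not ACCME's actual settings): each row holds one embedding neuron's outgoing weights, and any row whose group norm falls below the shrinkage threshold collapses exactly to zero, removing that neuron.

        import numpy as np

        def group_lasso_prox(W, lam, lr):
            """Row-wise soft thresholding: shrink each row's norm by lr*lam;
            rows whose norm is already below lr*lam are zeroed outright."""
            norms = np.linalg.norm(W, axis=1, keepdims=True)
            scale = np.maximum(0.0, 1.0 - lr * lam / np.maximum(norms, 1e-12))
            return W * scale

        rng = np.random.default_rng(0)
        W = np.vstack([rng.normal(scale=0.5, size=(8, 16)),    # active neurons
                       rng.normal(scale=0.01, size=(8, 16))])  # near-dead ones
        W = group_lasso_prox(W, lam=1.0, lr=0.1)   # norm threshold = 0.1

        print(np.linalg.norm(W, axis=1).round(2))  # weak rows are exactly 0.0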