75 research outputs found

    Projective simulation for classical learning agents: a comprehensive investigation

    We study the model of projective simulation (PS), a novel approach to artificial intelligence based on stochastic processing of episodic memory, which was recently introduced [1]. Here we provide a detailed analysis of the model and examine its performance, including its achievable efficiency, its learning times, and the way both properties scale with the problems' dimension. In addition, we situate the PS agent in different learning scenarios and study its learning abilities. A variety of new scenarios are considered, thereby demonstrating the model's flexibility. Furthermore, to put the PS scheme in context, we compare its performance with those of Q-learning and learning classifier systems, two popular models in the field of reinforcement learning. It is shown that PS is a competitive artificial intelligence model of unique properties and strengths. Austrian Science Fund (FWF) SFB FoQuS F4012; Templeton World Charity Foundation (TWCF

    XCS Algorithms for a Linear Combination of Discounted and Undiscounted Reward Markovian Decision Processes

    Many studies have shown that combining individual predictors improves prediction accuracy in domains such as psychology, statistics and management science. However, these studies have not tested the combination of reinforcement learning techniques. This study aims to develop an algorithm based on two iterative approximate forms of reinforcement learning in XCS. This algorithm, named MIXCS, combines Q-learning and R-learning techniques to compute the linear combination of the payoff resulting from the agent's actions, as well as the correspondence between the system prediction and the action value. As such, MIXCS predicts the payoff to be expected for each possible action. We test MIXCS in two two-dimensional grids, Environment1 and Environment2, which represent the financial-market actions of buying, selling and holding, to evaluate the performance of an agent acting as a trader seeking the desired profit. We calculate the optimum average payoff to predict the value of the next movement in both Environment1 and Environment2 and compare it with the results obtained by MIXCS. The results show that the performance of MIXCS is close to the optimum average payoff in Environment1, but not in Environment2. Also, the agent reaches the maximum reward by taking selling actions in both environments.

    Adaptive rule-based malware detection employing learning classifier systems

    Efficient and accurate malware detection is increasingly becoming a necessity for society to operate. Existing malware detection systems have excellent performance in identifying known malware for which signatures are available, but poor performance in anomaly detection for zero-day exploits, for which signatures have not yet been made available, or for targeted attacks against a specific entity. The primary goal of this thesis is to provide evidence for the potential of learning classifier systems to improve the accuracy of malware detection. A customized system based on a state-of-the-art learning classifier system is presented for adaptive rule-based malware detection. It combines a rule-based expert system with evolutionary-algorithm-based reinforcement learning, thus creating a self-training adaptive malware detection system which dynamically evolves detection rules. This system is analyzed on a benchmark of malicious and non-malicious files. Experimental results show that the system can outperform C4.5, a well-known non-adaptive machine learning algorithm, under certain conditions. The results demonstrate the system's ability to learn effective rules from repeated presentations of a tagged training set and show the degree of generalization achieved on an independent test set. This thesis is an extension and expansion of the work published in the Security, Trust, and Privacy for Software Applications workshop in COMPSAC 2011 - the 35th Annual IEEE Signature Conference on Computer Software and Applications --Abstract, page iii

    Learning classifier systems from first principles: A probabilistic reformulation of learning classifier systems from the perspective of machine learning

    Learning Classifier Systems (LCS) are a family of rule-based machine learning methods. They aim at the autonomous production of potentially human-readable results that are the most compact generalised representation whilst also maintaining high predictive accuracy, with a wide range of application areas, such as autonomous robotics, economics, and multi-agent systems. Their design is mainly approached heuristically and, even though their performance is competitive in regression and classification tasks, they do not meet their expected performance in sequential decision tasks despite being initially designed for such tasks. It is our contention that improvement is hindered by a lack of theoretical understanding of their underlying mechanisms and dynamics. EThOS - Electronic Theses Online Service, GB, United Kingdo

    Mixing independent classifiers

    Learning classifier systems from first principles

    INVESTIGATIONS INTO THE COGNITIVE ABILITIES OF ALTERNATE LEARNING CLASSIFIER SYSTEM ARCHITECTURES

    The Learning Classifier System (LCS) and its descendant, XCS, are promising paradigms for machine learning design and implementation. Whereas LCS allows classifier payoff predictions to guide system performance, XCS focuses on payoff-prediction accuracy instead, allowing it to evolve optimal classifier sets in particular applications requiring rational thought. This research examines LCS and XCS performance in artificial situations with broad social/commercial parallels, created using the non-Markov Iterated Prisoner's Dilemma (IPD) game-playing scenario, where the setting is sometimes asymmetric and where irrationality sometimes pays. This research systematically perturbs a conventional IPD-playing LCS-based agent until it results in a full-fledged XCS-based agent, contrasting the simulated behavior of each LCS variant in terms of a number of performance measures. The intent is to examine the XCS paradigm to understand how it copes better with a given situation (if it does) than the LCS perturbations studied. Experimental results indicate that the majority of the architectural differences do have a significant effect on the agents' performance with respect to the performance measures used in this research. The results of these competitions indicate that while each architectural difference significantly affected its agent's performance, no single architectural difference could be credited as causing XCS's demonstrated superiority in evolving optimal populations. Instead, the data suggest that XCS's ability to evolve optimal populations in the multiplexer and IPD problem domains results from the combined and synergistic effects of multiple architectural differences. In addition, it is demonstrated that XCS is able to reliably evolve the Optimal Population [O] against the TFT opponent. This result supports Kovacs' Optimality Hypothesis in the IPD environment and is significant because it is the first demonstrated occurrence of this ability in an environment other than the multiplexer and Woods problem domains. It is therefore apparent that while XCS performs better than its LCS-based counterparts, its demonstrated superiority may not be attributed to a single architectural characteristic. Instead, XCS's ability to evolve optimal classifier populations in the multiplexer problem domain and in the IPD problem domain studied in this research results from the combined and synergistic effects of multiple architectural differences

    Constructing Complexity-efficient Features in XCS with Tree-based Rule Conditions

    A major goal of machine learning is to create techniques that abstract away irrelevant information. The generalisation property of a standard Learning Classifier System (LCS) removes such information at the feature level but not at the feature interaction level. Code Fragments (CFs), a form of tree-based programs, introduced feature manipulation to discover important interactions, but they often contain irrelevant information, which causes structural inefficiency. XOF is a recently introduced LCS that uses CFs to encode building blocks of knowledge about feature interaction. This paper aims to optimise the structural efficiency of CFs in XOF. We propose two measures to improve the construction of CFs to achieve this goal. The first is a new CF-fitness update that estimates the applicability of CFs while also considering their structural complexity. The second is a niche-based method of generating CFs. These approaches were tested on Even-parity and Hierarchical problems, which require highly complex combinations of input features to capture the data patterns. The results show that the proposed methods significantly increase the structural efficiency of CFs, which is estimated by the rule "generality rate". This results in faster learning performance in the Hierarchical Majority-on problem. Furthermore, a user-set depth limit for CF generation is not needed, as the learning agent will not adopt higher-level CFs once optimal CFs are constructed