
    SupRB: A Supervised Rule-based Learning System for Continuous Problems

    We propose the SupRB learning system, a new Pittsburgh-style learning classifier system (LCS) for supervised learning on multi-dimensional continuous decision problems. SupRB learns an approximation of a quality function from examples (consisting of situations, choices, and associated qualities) and is then able to make an optimal choice as well as predict the quality of a choice in a given situation. One area of application for SupRB is the parametrization of industrial machinery. In this field, acceptance of the recommendations of machine learning systems is highly reliant on operators' trust. While an essential and much-researched ingredient for that trust is prediction quality, this alone does not seem to be enough. At least as important is a human-understandable explanation of the reasoning behind a recommendation. While many state-of-the-art methods, such as artificial neural networks, fall short of this, LCSs such as SupRB provide human-readable rules that can be understood very easily. The prevalent LCSs are not directly applicable to this problem as they lack support for continuous choices. This paper lays the foundations for SupRB and shows its general applicability on a simplified model of an additive manufacturing problem.
    Comment: Submitted to the Genetic and Evolutionary Computation Conference 2020 (GECCO 2020).
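    As a rough illustration of the kind of human-readable, interval-conditioned rule such a system works with, here is a minimal sketch; the class and function names are illustrative, not SupRB's actual API, and the local linear quality model is an assumption:

```python
import numpy as np

class IntervalRule:
    """A human-readable rule: IF the situation lies within [lower, upper]
    THEN predict the quality of a choice with a local linear model."""

    def __init__(self, lower, upper, coef, intercept):
        self.lower = np.asarray(lower)    # per-dimension lower bounds
        self.upper = np.asarray(upper)    # per-dimension upper bounds
        self.coef = np.asarray(coef)      # linear weights over the choice
        self.intercept = intercept

    def matches(self, situation):
        return bool(np.all((self.lower <= situation) & (situation <= self.upper)))

    def predict_quality(self, choice):
        return float(self.coef @ np.atleast_1d(choice) + self.intercept)

def best_choice(rules, situation, candidate_choices):
    """Return the candidate choice with the highest quality predicted by
    the rules matching the situation (None if no rule matches)."""
    matching = [r for r in rules if r.matches(situation)]
    if not matching:
        return None
    scores = [np.mean([r.predict_quality(c) for r in matching])
              for c in candidate_choices]
    return candidate_choices[int(np.argmax(scores))]
```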

    Efficient Training Set Use For Blood Pressure Prediction in a Large Scale Learning Classifier System

    We define a machine learning problem to forecast arterial blood pressure. Our goal is to solve this problem with a large-scale learning classifier system. Because learning classifier systems are extremely computationally intensive and this problem's eventually large training set will be very costly to process, we address how to use less of the training set without negatively impacting learning accuracy. Our approach is to allow competition among solutions which have not been evaluated on the entire training set. The best of these solutions are then evaluated on more of the training set, while their offspring start off being evaluated on less of it. To keep selection fair, we divide competing solutions according to how many training examples they have been tested on.
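    A minimal sketch of this layered evaluation scheme, assuming solutions are simple dicts and using an illustrative placeholder fitness function; parameter values and names are not taken from the paper:

```python
import random
from collections import defaultdict

def evaluate(solution, examples):
    """Placeholder fitness: fraction of examples the solution classifies
    correctly (stands in for the full LCS evaluation)."""
    return sum(solution["model"](x) == y for x, y in examples) / len(examples)

def layered_selection(population, training_set, base=32, growth=2):
    """Fair selection under partial evaluation: solutions compete only
    against peers tested on the same number of training examples, and
    the winners graduate to a larger slice of the training set.
    'base', 'growth', and the dict keys are illustrative."""
    tiers = defaultdict(list)
    for sol in population:
        tiers[sol["examples_seen"]].append(sol)   # group by evaluation effort

    survivors = []
    for n_seen, peers in sorted(tiers.items()):
        peers.sort(key=lambda s: s["fitness"], reverse=True)
        for winner in peers[: max(1, len(peers) // 2)]:  # best half advances
            n_next = min(max(base, n_seen * growth), len(training_set))
            sample = random.sample(training_set, n_next)
            winner["fitness"] = evaluate(winner, sample)
            winner["examples_seen"] = n_next
            survivors.append(winner)
    return survivors
```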

    Towards better than human capability in diagnosing prostate cancer using infrared spectroscopic imaging

    Cancer diagnosis is essentially a human task. Almost universally, the process requires the extraction of tissue (biopsy) and examination of its microstructure by a human. To improve diagnoses based on limited and inconsistent morphologic knowledge, a new approach has recently been proposed that uses molecular spectroscopic imaging to exploit microscopic chemical composition for diagnosis. In contrast to visible imaging, this approach results in very large data sets, as each pixel contains the entire molecular vibrational spectroscopy data from all chemical species. Here, we propose data handling and analysis strategies that allow computer-based diagnosis of human prostate cancer by applying a novel genetics-based machine learning technique (NAX). We apply this technique to demonstrate both fast learning and accurate classification that, additionally, scales well with parallelization. Preliminary results demonstrate that this approach can improve current clinical practice in diagnosing prostate cancer.
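    To illustrate why these data sets get so large, here is a hedged sketch of the basic data-handling step the abstract implies: flattening a spectroscopic image cube into one training example per pixel. Array shapes and names are illustrative, not taken from the paper:

```python
import numpy as np

def cube_to_examples(cube, labels):
    """Flatten a spectroscopic image cube (rows x cols x wavenumbers) into
    a per-pixel design matrix: one training example per pixel, whose
    features are that pixel's entire vibrational spectrum."""
    n_rows, n_cols, n_bands = cube.shape
    X = cube.reshape(n_rows * n_cols, n_bands)   # one spectrum per row
    y = labels.reshape(n_rows * n_cols)          # per-pixel tissue class
    return X, y

# illustrative shapes only; even a small 64x64 tile with 1500 spectral
# bands already yields 4096 examples of 1500 features each
cube = np.random.rand(64, 64, 1500)
labels = np.random.randint(0, 2, size=(64, 64))
X, y = cube_to_examples(cube, labels)
print(X.shape, y.shape)   # (4096, 1500) (4096,)
```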

    MILCS: A mutual information learning classifier system

    This paper introduces a new variety of learning classifier system (LCS), called MILCS, which utilizes mutual information as fitness feedback. Unlike most LCSs, MILCS is specifically designed for supervised learning. MILCS's design draws on an analogy to the structural learning approach of cascade correlation networks. We present preliminary results and contrast them with results from XCS. We discuss the explanatory power of the resulting rule sets and introduce a new technique for visualizing explanatory power. Final comments include future directions for this research, including investigations in neural networks and other systems.
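    The abstract does not spell out the computation, but the following sketch shows one standard way to estimate the empirical mutual information between a rule set's predictions and the labels, the kind of quantity MILCS uses as fitness feedback:

```python
import numpy as np

def mutual_information(predictions, labels):
    """Empirical mutual information I(P; Y), in bits, between a rule set's
    discrete predictions and the true labels. As a fitness signal it
    rewards outputs that carry information about the labels even before
    accuracy is perfect."""
    predictions, labels = np.asarray(predictions), np.asarray(labels)
    mi = 0.0
    for p in np.unique(predictions):
        for y in np.unique(labels):
            p_joint = np.mean((predictions == p) & (labels == y))
            if p_joint > 0:
                mi += p_joint * np.log2(
                    p_joint / (np.mean(predictions == p) * np.mean(labels == y)))
    return mi

# an imperfect classifier still earns partial credit
labels = np.array([0, 0, 0, 0, 1, 1, 1, 1])
preds = np.array([0, 0, 0, 1, 1, 1, 1, 1])
print(mutual_information(preds, labels))  # ~0.55 bits (> 0, < 1)
```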

    A New Evolutionary Algorithm For Mining Noisy, Epistatic, Geospatial Survey Data Associated With Chagas Disease

    The scientific community is just beginning to understand some of the profound effects that feature interactions and heterogeneity have on natural systems. Despite the belief that these nonlinear and heterogeneous interactions exist across numerous real-world systems (e.g., from the development of personalized drug therapies to market predictions of consumer behaviors), the tools for analysis have not kept pace. This research was motivated by the desire to mine data from large socioeconomic surveys aimed at identifying the drivers of household infestation by a Triatomine insect that transmits the life-threatening Chagas disease. To decrease the risk of transmission, our colleagues at the laboratory of applied entomology and parasitology have implemented mitigation strategies (known as Ecohealth interventions); however, limited resources necessitate the search for better risk models. Mining these complex Chagas survey data for potential predictive features is challenging due to imbalanced class outcomes, missing data, heterogeneity, and the non-independence of some features. We develop an evolutionary algorithm (EA) to identify feature interactions in Big Datasets with desired categorical outcomes (e.g., disease or infestation). The method is non-parametric and uses the hypergeometric PMF as a fitness function to tackle challenges associated with using p-values in Big Data (e.g., p-values decrease inversely with the size of the dataset). To demonstrate the EA's effectiveness, we first test the algorithm on three benchmark datasets: two classic Boolean classifier problems, (1) the majority-on problem and (2) the multiplexer problem, as well as (3) a simulated single nucleotide polymorphism (SNP) disease dataset. Next, we apply the EA to real-world Chagas disease survey data and successfully identify numerous high-order feature interactions associated with infestation that would not have been discovered using traditional statistics. These feature interactions are also explored using network analysis. The spatial autocorrelation of the genetic data (SNPs of Triatoma dimidiata) was captured using geostatistics. Specifically, a modified semivariogram analysis was performed to characterize the SNP data and help elucidate the movement of the vector within two villages. For both villages, the SNP information showed strong spatial autocorrelation, albeit with different geostatistical characteristics (sills, ranges, and nuggets). These metrics were leveraged to create risk maps which suggest that the more forested village had a sylvatic source of infestation, while the other village had a domestic/peridomestic source. This initial exploration into using Big Data to analyze disease risk shows that novel statistical tools, and modifications of existing ones, can improve the assessment of risk on a fine scale.
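    As a sketch of the fitness idea described above, the hypergeometric PMF can score a candidate feature interaction by how improbable its case enrichment is under chance; the negative-log scoring and the variable names below are assumptions, not the thesis's exact formulation:

```python
from math import log

from scipy.stats import hypergeom

def interaction_fitness(n_total, n_cases, n_matching, n_matching_cases):
    """Score a candidate feature interaction by the hypergeometric PMF:
    the probability of seeing exactly n_matching_cases infested households
    among the n_matching households that exhibit the interaction, given
    n_cases infested households out of n_total surveyed. Returning the
    negative log, so that rarer (more interesting) enrichments score
    higher, is an illustrative choice."""
    p = hypergeom.pmf(n_matching_cases, n_total, n_cases, n_matching)
    return -log(p) if p > 0 else float("inf")

# illustrative counts: 2000 households surveyed, 300 infested;
# 50 match the candidate interaction, 25 of those are infested
print(interaction_fitness(2000, 300, 50, 25))
```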

    Evolutionary approaches to optimisation in rough machining

    This thesis concerns the use of Evolutionary Computation to optimise the sequence and selection of tools and machining parameters in rough milling applications. These processes are not automated in current Computer-Aided Manufacturing (CAM) software, and this work, undertaken in collaboration with an industrial partner, aims to address this. Related research has mainly approached tool sequence optimisation using only a single tool type, and machining parameter optimisation only for single-tool sequences. In a real-world industrial setting, tools with different geometrical profiles are commonly used in combination on rough machining tasks in order to produce components with complex sculptured surfaces. This work introduces a new representation scheme and search operators to support the use of the three most commonly used tool types: end mill, ball nose and toroidal. Using these operators, single-objective metaheuristic algorithms are shown to find near-optimal solutions while surveying only a small number of tool sequences. For the first time, a multi-objective approach is taken to tool sequence optimisation. The process of ‘multi-objectivisation’ is shown to offer two benefits: escaping local optima on deceptive multimodal search spaces and providing a selection of tool sequence alternatives to a machinist. The multi-objective approach is also used to produce a varied set of near-Pareto-optimal solutions offering different trade-offs between total machining time and total tooling costs, simultaneously optimising tool sequences and the cutting speeds of individual tools. A challenge in using computationally expensive CAM software, important for real-world machining, is the time cost of evaluations. An asynchronous parallel evolutionary optimisation system is presented that can provide a significant speed-up, even in the presence of the heterogeneous evaluation times produced by variable-length tool sequences. This system uses a distributed network of processors that could be easily and inexpensively implemented on existing commercial hardware, making it accessible even to small workshops.
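    As a minimal sketch of the two-objective comparison underlying such a trade-off set, assuming illustrative tool and sequence representations rather than the thesis's actual ones:

```python
from dataclasses import dataclass

@dataclass
class Tool:
    kind: str            # "end_mill", "ball_nose" or "toroidal"
    diameter_mm: float
    cost: float          # contribution to total tooling cost

@dataclass
class ToolSequence:
    tools: list
    machining_time: float      # total time, e.g. from a CAM simulation

    @property
    def tooling_cost(self):
        return sum(t.cost for t in self.tools)

def dominates(a, b):
    """Pareto dominance on (machining time, tooling cost): a dominates b
    if it is no worse on both objectives and strictly better on one."""
    no_worse = (a.machining_time <= b.machining_time
                and a.tooling_cost <= b.tooling_cost)
    strictly_better = (a.machining_time < b.machining_time
                       or a.tooling_cost < b.tooling_cost)
    return no_worse and strictly_better

def pareto_front(candidates):
    """Keep only non-dominated tool sequences: the set of trade-offs
    offered to the machinist."""
    return [c for c in candidates
            if not any(dominates(o, c) for o in candidates if o is not c)]
```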

    Controlled self-organisation using learning classifier systems

    As the complexity of technical systems increases, breakdowns occur more often. The mission of organic computing is to tame these challenges by providing degrees of freedom for self-organised behaviour. To achieve these goals, new methods have to be developed. The proposed observer/controller architecture constitutes one way to achieve controlled self-organisation. To improve its design, multi-agent scenarios are investigated; in particular, learning with learning classifier systems is addressed.
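    A minimal sketch of the observer/controller pattern described here, with an illustrative load-balancing scenario standing in for the multi-agent system; in the proposed architecture a learning classifier system would learn the controller's situation-to-action mapping rather than hard-coding it as below:

```python
import random

class Agent:
    """Minimal self-organising agent with a fluctuating load."""
    def __init__(self):
        self.load = random.random()

    def shed_load(self):
        self.load *= 0.5      # corrective action the controller can trigger

class System:
    """System under observation and control."""
    def __init__(self, n_agents=10):
        self.agents = [Agent() for _ in range(n_agents)]

    def tick(self):           # the system's own self-organised dynamics
        for a in self.agents:
            a.load = min(1.0, max(0.0, a.load + random.uniform(-0.1, 0.2)))

class Observer:
    """Aggregates raw measurements into a situation description."""
    def observe(self, system):
        mean_load = sum(a.load for a in system.agents) / len(system.agents)
        return {"mean_load": mean_load}

class Controller:
    """Intervenes only when the situation leaves the target corridor;
    otherwise the system is left to organise itself."""
    def control(self, situation, system):
        if situation["mean_load"] > 0.8:       # corridor violated
            for a in system.agents:
                a.shed_load()

system, observer, controller = System(), Observer(), Controller()
for _ in range(100):
    controller.control(observer.observe(system), system)
    system.tick()
```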