4,010 research outputs found

    Applying genetic programming to learn spatial differences between textures using a translation invariant representation

    This paper describes an approach to evolving texture feature extraction programs using tree-based genetic programming. The programs are evolved from a learning set of 13 textures selected from the Brodatz database. In the evolutionary phase, texture images are first "binarised" to 256 grey levels. An encoding of the positions of the black pixels is used as the input to the evolved programs. A separate feature extraction program is evolved for each of the 256 grey levels. Fitness is measured by applying the evolved program to all of the images in the learning set, applying one-dimensional clustering to the outputs, and then using the separation between the clusters as the fitness value. On two benchmark problems, using the evolved programs for feature extraction and a nearest neighbour classifier, the evolved features gave test accuracies of 74.6% and 66.2% respectively for a 13-texture Brodatz problem and a 15-texture Vistex problem. This is better than a number of human-derived methods on the same problems.
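    The fitness idea in this abstract can be illustrated with a short sketch. It is not the paper's implementation: it assumes the clusters are simply the known texture classes, represents each cluster by the interval its outputs span, and scores a program by the smallest gap between neighbouring intervals. The function name and the data layout are hypothetical.

```python
def cluster_separation_fitness(outputs_by_texture):
    """Score a candidate feature by how well its 1-D outputs separate textures.

    outputs_by_texture maps a texture name to the values one evolved program
    produced for that texture's images (hypothetical data layout).
    """
    # Assumption: each texture forms one cluster, summarised by [min, max].
    intervals = sorted((min(v), max(v)) for v in outputs_by_texture.values())
    # Gap between neighbouring clusters on the line; negative when they overlap.
    gaps = [next_lo - cur_hi
            for (_, cur_hi), (next_lo, _) in zip(intervals, intervals[1:])]
    # The worst-separated pair of neighbours determines the score.
    return min(gaps)


well_separated = {"bark": [0.1, 0.2], "brick": [0.8, 0.9]}
overlapping = {"bark": [0.1, 0.6], "brick": [0.5, 0.9]}
print(cluster_separation_fitness(well_separated) >
      cluster_separation_fitness(overlapping))
```

    A program whose outputs overlap across textures receives a negative score under this sketch, so selection would favour programs that push the clusters apart.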

    Genetic programming applied to morphological image processing

    This thesis presents three approaches to the automatic design of algorithms for the processing of binary images based on the Genetic Programming (GP) paradigm. In the first approach the algorithms are designed using the basic Mathematical Morphology (MM) operators, i.e. erosion and dilation, with a variety of Structuring Elements (SEs). GP is used to design algorithms to convert a binary image into another containing just a particular characteristic of interest. In the study we tested two similarity fitness functions, training sets with different numbers of elements, and different sizes of training images over three different objectives. The results of the first approach showed some success in the evolution of MM algorithms but also identified problems with the amount of computational resources the method required. The second approach uses Sub-Machine-Code GP (SMCGP) and bitwise operators in an attempt to speed up the evolution of the algorithms and to make them both feasible and effective. The SMCGP approach succeeded in speeding up the computation but did not improve the quality of the obtained algorithms. The third approach combines logical and morphological operators in an attempt to improve the quality of the automatically designed algorithms. The results obtained provide empirical evidence that the evolution of high quality MM algorithms using GP is possible and that this technique has a broad potential that should be explored further. This thesis includes an analysis of the potential of GP and other Machine Learning techniques for solving the general problem of Signal Understanding by means of exploring Mathematical Morphology.
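    The two basic MM operators named in this abstract can be sketched in a few lines. This is a minimal illustration rather than the thesis code: images are lists of 0/1 rows, a structuring element is a list of (row, column) offsets, and pixels outside the image are assumed to be 0. The `opening` composition is a standard MM operation, included only to show the kind of operator chain a GP system could assemble.

```python
# Cross-shaped structuring element as (row, column) offsets (illustrative choice).
CROSS = [(0, 0), (-1, 0), (1, 0), (0, -1), (0, 1)]


def _is_on(image, r, c):
    # Assumption: anything outside the image counts as background (0).
    return 0 <= r < len(image) and 0 <= c < len(image[0]) and image[r][c] == 1


def erode(image, se):
    """Binary erosion: a pixel is 1 only if every SE offset lands on a 1."""
    return [[int(all(_is_on(image, r + dr, c + dc) for dr, dc in se))
             for c in range(len(image[0]))] for r in range(len(image))]


def dilate(image, se):
    """Binary dilation: a pixel is 1 if the SE placed on some 1-pixel covers it."""
    return [[int(any(_is_on(image, r - dr, c - dc) for dr, dc in se))
             for c in range(len(image[0]))] for r in range(len(image))]


def opening(image, se):
    """Erosion followed by dilation with the same structuring element."""
    return dilate(erode(image, se), se)


square = [[0] * 5] + [[0, 1, 1, 1, 0] for _ in range(3)] + [[0] * 5]
for row in opening(square, CROSS):
    print(row)
```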

    A Field Guide to Genetic Programming

    xiv, 233 p.: ill.; 23 cm. Electronic book.
    A Field Guide to Genetic Programming (ISBN 978-1-4092-0073-4) is an introduction to genetic programming (GP). GP is a systematic, domain-independent method for getting computers to solve problems automatically starting from a high-level statement of what needs to be done. Using ideas from natural evolution, GP starts from an ooze of random computer programs, and progressively refines them through processes of mutation and sexual recombination, until solutions emerge. All this without the user having to know or specify the form or structure of solutions in advance. GP has generated a plethora of human-competitive results and applications, including novel scientific discoveries and patentable inventions.
    Contents: Introduction -- Representation, initialisation and operators in Tree-based GP -- Getting ready to run genetic programming -- Example genetic programming run -- Alternative initialisations and operators in Tree-based GP -- Modular, grammatical and developmental Tree-based GP -- Linear and graph genetic programming -- Probabilistic genetic programming -- Multi-objective genetic programming -- Fast and distributed genetic programming -- GP theory and its applications -- Applications -- Troubleshooting GP -- Conclusions.
    Detailed contents:
    1 Introduction: 1.1 Genetic Programming in a Nutshell; 1.2 Getting Started; 1.3 Prerequisites; 1.4 Overview of this Field Guide
    I Basics
    2 Representation, Initialisation and Operators in Tree-based GP: 2.1 Representation; 2.2 Initialising the Population; 2.3 Selection; 2.4 Recombination and Mutation
    3 Getting Ready to Run Genetic Programming: 3.1 Step 1: Terminal Set; 3.2 Step 2: Function Set (3.2.1 Closure; 3.2.2 Sufficiency; 3.2.3 Evolving Structures other than Programs); 3.3 Step 3: Fitness Function; 3.4 Step 4: GP Parameters; 3.5 Step 5: Termination and solution designation
    4 Example Genetic Programming Run: 4.1 Preparatory Steps; 4.2 Step-by-Step Sample Run (4.2.1 Initialisation; 4.2.2 Fitness Evaluation; 4.2.3 Selection, Crossover and Mutation; 4.2.4 Termination and Solution Designation)
    II Advanced Genetic Programming
    5 Alternative Initialisations and Operators in Tree-based GP: 5.1 Constructing the Initial Population (5.1.1 Uniform Initialisation; 5.1.2 Initialisation may Affect Bloat; 5.1.3 Seeding); 5.2 GP Mutation (5.2.1 Is Mutation Necessary?; 5.2.2 Mutation Cookbook); 5.3 GP Crossover; 5.4 Other Techniques
    6 Modular, Grammatical and Developmental Tree-based GP: 6.1 Evolving Modular and Hierarchical Structures (6.1.1 Automatically Defined Functions; 6.1.2 Program Architecture and Architecture-Altering); 6.2 Constraining Structures (6.2.1 Enforcing Particular Structures; 6.2.2 Strongly Typed GP; 6.2.3 Grammar-based Constraints; 6.2.4 Constraints and Bias); 6.3 Developmental Genetic Programming; 6.4 Strongly Typed Autoconstructive GP with PushGP
    7 Linear and Graph Genetic Programming: 7.1 Linear Genetic Programming (7.1.1 Motivations; 7.1.2 Linear GP Representations; 7.1.3 Linear GP Operators); 7.2 Graph-Based Genetic Programming (7.2.1 Parallel Distributed GP (PDGP); 7.2.2 PADO; 7.2.3 Cartesian GP; 7.2.4 Evolving Parallel Programs using Indirect Encodings)
    8 Probabilistic Genetic Programming: 8.1 Estimation of Distribution Algorithms; 8.2 Pure EDA GP; 8.3 Mixing Grammars and Probabilities
    9 Multi-objective Genetic Programming: 9.1 Combining Multiple Objectives into a Scalar Fitness Function; 9.2 Keeping the Objectives Separate (9.2.1 Multi-objective Bloat and Complexity Control; 9.2.2 Other Objectives; 9.2.3 Non-Pareto Criteria); 9.3 Multiple Objectives via Dynamic and Staged Fitness Functions; 9.4 Multi-objective Optimisation via Operator Bias
    10 Fast and Distributed Genetic Programming: 10.1 Reducing Fitness Evaluations/Increasing their Effectiveness; 10.2 Reducing Cost of Fitness with Caches; 10.3 Parallel and Distributed GP are Not Equivalent; 10.4 Running GP on Parallel Hardware (10.4.1 Master–slave GP; 10.4.2 GP Running on GPUs; 10.4.3 GP on FPGAs; 10.4.4 Sub-machine-code GP); 10.5 Geographically Distributed GP
    11 GP Theory and its Applications: 11.1 Mathematical Models; 11.2 Search Spaces; 11.3 Bloat (11.3.1 Bloat in Theory; 11.3.2 Bloat Control in Practice)
    III Practical Genetic Programming
    12 Applications: 12.1 Where GP has Done Well; 12.2 Curve Fitting, Data Modelling and Symbolic Regression; 12.3 Human Competitive Results – the Humies; 12.4 Image and Signal Processing; 12.5 Financial Trading, Time Series, and Economic Modelling; 12.6 Industrial Process Control; 12.7 Medicine, Biology and Bioinformatics; 12.8 GP to Create Searchers and Solvers – Hyper-heuristics; 12.9 Entertainment and Computer Games; 12.10 The Arts; 12.11 Compression
    13 Troubleshooting GP: 13.1 Is there a Bug in the Code?; 13.2 Can you Trust your Results?; 13.3 There are No Silver Bullets; 13.4 Small Changes can have Big Effects; 13.5 Big Changes can have No Effect; 13.6 Study your Populations; 13.7 Encourage Diversity; 13.8 Embrace Approximation; 13.9 Control Bloat; 13.10 Checkpoint Results; 13.11 Report Well; 13.12 Convince your Customers
    14 Conclusions
    IV Tricks of the Trade
    A Resources: A.1 Key Books; A.2 Key Journals; A.3 Key International Meetings; A.4 GP Implementations; A.5 On-Line Resources
    B TinyGP: B.1 Overview of TinyGP; B.2 Input Data Files for TinyGP; B.3 Source Code; B.4 Compiling and Running TinyGP
    Bibliography
    Index
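    The evolutionary loop summarised in this entry (random initial programs, then repeated selection and variation) can be sketched as follows. This is an illustrative toy, not TinyGP or any code from the book: it assumes a three-operator function set, truncation selection, and subtree mutation only, with crossover omitted for brevity. All names and parameter values are hypothetical.

```python
import operator
import random

FUNCTIONS = {"+": operator.add, "-": operator.sub, "*": operator.mul}
TERMINALS = ["x", 1.0, 2.0]


def random_tree(rng, depth):
    """Grow a random expression tree as nested tuples (op, left, right)."""
    if depth == 0 or rng.random() < 0.3:
        return rng.choice(TERMINALS)
    op = rng.choice(sorted(FUNCTIONS))
    return (op, random_tree(rng, depth - 1), random_tree(rng, depth - 1))


def evaluate(tree, x):
    """Evaluate a tree for one value of the input variable x."""
    if isinstance(tree, tuple):
        op, left, right = tree
        return FUNCTIONS[op](evaluate(left, x), evaluate(right, x))
    return x if tree == "x" else tree  # variable or numeric constant


def mutate(rng, tree, depth=2):
    """Subtree mutation: replace one randomly chosen subtree with a new one."""
    if not isinstance(tree, tuple) or rng.random() < 0.3:
        return random_tree(rng, depth)
    op, left, right = tree
    if rng.random() < 0.5:
        return (op, mutate(rng, left, depth), right)
    return (op, left, mutate(rng, right, depth))


def error(tree, cases):
    """Sum of absolute errors over (x, y) fitness cases; lower is better."""
    return sum(abs(evaluate(tree, x) - y) for x, y in cases)


def evolve(cases, pop_size=50, generations=20, seed=0):
    rng = random.Random(seed)
    population = [random_tree(rng, 3) for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=lambda t: error(t, cases))
        survivors = population[: pop_size // 2]  # truncation selection
        children = [mutate(rng, rng.choice(survivors)) for _ in survivors]
        population = survivors + children
    return min(population, key=lambda t: error(t, cases))


cases = [(float(x), float(x * x + 1)) for x in range(-3, 4)]
best = evolve(cases)
print(best, error(best, cases))
```

    Because the better half of each generation is carried over unchanged, the best error in this sketch never gets worse from one generation to the next.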

    GPIS: genetic programming based image segmentation with applications to biomedical object detection

    Image segmentation plays a critical role in many image analysis applications. However, it is ill-defined in nature and remains one of the most intractable problems in image processing. In this thesis, we propose a genetic programming based algorithm for image segmentation (GPIS). Genetic programming is a program discovery method inspired by Darwinian evolution, and it has been used successfully in the past as an automatic programming tool. We make use of this property of GP to evolve efficient and accurate image segmentation programs from a pool of basic image analysis operators. In addition, we provide no a priori information about the nature of the images to the GP. The algorithm was tested on two separate medical image databases, and the results show the proposed GP's ability to adapt and produce short and accurate segmentation algorithms, irrespective of the database in use. We compared our results with a popular GA-based image segmentation/classification system, GENIE Pro. We found that our proposed algorithm produced accurate image segmentations, performed consistently on both databases, and could possibly be extended to other image databases as a general-purpose image segmentation tool.

    A gaussian mixture-based approach to synthesizing nonlinear feature functions for automated object detection

    Feature design is an important part of object detection, in which objects of interest are assigned to a known number of categories or classes. Based on the depth-first search for higher order feature functions, the technique of automated feature synthesis is generally considered to be a process of creating more effective features from raw feature data during the run of the algorithms. This dynamic synthesis of nonlinear feature functions is a challenging problem in object detection. This thesis presents an approach that combines genetic programming with the expectation maximization algorithm (GP-EM) to synthesize nonlinear feature functions automatically in order to solve given object detection tasks. The EM algorithm uses a Gaussian mixture, which is able to model the behaviour of the training samples during an optimal GP search strategy. Based on the Gaussian probability assumption, the GP-EM method is capable of simultaneously performing dynamic feature synthesis and model-based generalization. The EM part of the approach leads to the application of the maximum likelihood (ML) operation that provides protection against inter-cluster data separation and thus exhibits improved convergence. Additionally, with the GP-EM method, an innovative technique, called the histogram region of interest by thresholds (HROIBT), is introduced for diagnosing protein conformation defects (PCD) from microscopic imagery. The experimental results show that the proposed approach improves the detection accuracy and efficiency of pattern object discovery, as compared to single GP-based feature synthesis methods and also a number of other object detection systems. The GP-EM method projects the hyperspace of the raw data onto lower-dimensional spaces efficiently, resulting in faster computational classification processes.

    A hybrid algorithm for Bayesian network structure learning with application to multi-label learning

    We present a novel hybrid algorithm for Bayesian network structure learning, called H2PC. It first reconstructs the skeleton of a Bayesian network and then performs a Bayesian-scoring greedy hill-climbing search to orient the edges. The algorithm is based on divide-and-conquer constraint-based subroutines to learn the local structure around a target variable. We conduct two series of experimental comparisons of H2PC against Max-Min Hill-Climbing (MMHC), which is currently the most powerful state-of-the-art algorithm for Bayesian network structure learning. First, we use eight well-known Bayesian network benchmarks with various data sizes to assess the quality of the learned structure returned by the algorithms. Our extensive experiments show that H2PC outperforms MMHC in terms of goodness of fit to new data and quality of the network structure with respect to the true dependence structure of the data. Second, we investigate H2PC's ability to solve the multi-label learning problem. We provide theoretical results to characterize and identify graphically the so-called minimal label powersets that appear as irreducible factors in the joint distribution under the faithfulness condition. The multi-label learning problem is then decomposed into a series of multi-class classification problems, where each multi-class variable encodes a label powerset. H2PC is shown to compare favorably to MMHC in terms of global classification accuracy over ten multi-label data sets covering different application domains. Overall, our experiments support the conclusions that local structural learning with H2PC in the form of local neighborhood induction is a theoretically well-motivated and empirically effective learning framework that is well suited to multi-label learning. The source code (in R) of H2PC as well as all data sets used for the empirical tests are publicly available. Comment: arXiv admin note: text overlap with arXiv:1101.5184 by other authors.
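    The decomposition step described in this abstract, where each multi-class variable encodes a label powerset, can be illustrated with a small sketch. It does not implement H2PC or the identification of minimal label powersets: the grouping of labels is assumed to be given, and the function name and data layout are hypothetical.

```python
def encode_label_powersets(label_rows, groups):
    """Turn a multi-label target into one multi-class target per label group.

    label_rows: list of 0/1 tuples, one entry per label.
    groups: list of label-index lists (assumed given; in the paper they would
    come from the learned network structure).
    Returns, per group, the class id of every row and the class dictionary.
    """
    results = []
    for group in groups:
        class_of = {}  # label combination -> class id, in order of appearance
        encoded = []
        for row in label_rows:
            combo = tuple(row[i] for i in group)
            class_of.setdefault(combo, len(class_of))
            encoded.append(class_of[combo])
        results.append((encoded, class_of))
    return results


rows = [(1, 0, 1), (0, 1, 1), (1, 0, 0)]
for encoded, classes in encode_label_powersets(rows, [[0, 1], [2]]):
    print(encoded, classes)
```

    Each group then becomes an ordinary multi-class problem, and a prediction for the full label vector is obtained by decoding the predicted class of every group.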

    Data mining in soft computing framework: a survey

    The present article provides a survey of the available literature on data mining using soft computing. A categorization has been provided based on the different soft computing tools and their hybridizations used, the data mining function implemented, and the preference criterion selected by the model. The utility of the different soft computing methodologies is highlighted. Generally, fuzzy sets are suitable for handling the issues related to understandability of patterns, incomplete/noisy data, mixed media information and human interaction, and can provide approximate solutions faster. Neural networks are nonparametric, robust, and exhibit good learning and generalization capabilities in data-rich environments. Genetic algorithms provide efficient search algorithms to select a model, from mixed media data, based on some preference criterion/objective function. Rough sets are suitable for handling different types of uncertainty in data. Some challenges to data mining and the application of soft computing methodologies are indicated. An extensive bibliography is also included.