202 research outputs found

    How Noisy Data Affects Geometric Semantic Genetic Programming

    Noise is a consequence of acquiring and pre-processing data from the environment, and shows fluctuations from different sources, e.g., sensors, signal processing technology or even human error. As a machine learning technique, Genetic Programming (GP) is not immune to this problem, which the field has frequently addressed. Recently, Geometric Semantic Genetic Programming (GSGP), a semantic-aware branch of GP, has shown robustness and high generalization capability. Researchers believe these characteristics may be associated with a lower sensitivity to noisy data. However, there is no systematic study on this matter. This paper analyses in depth the performance of GSGP in the presence of noise. Using 15 synthetic datasets where noise can be controlled, we added different ratios of noise to the data and compared the results obtained with those of a canonical GP. The results show that, as we increase the percentage of noisy instances, the degradation in generalization performance is more pronounced in GSGP than in GP. However, in general, GSGP is more robust to noise than GP in the presence of up to 10% of noise, and presents no statistical difference for values higher than that in the test bed. Comment: 8 pages, in Proceedings of the Genetic and Evolutionary Computation Conference (GECCO 2017), Berlin, Germany

    Cellular geometric semantic genetic programming

    Among the different variants of Genetic Programming (GP), Geometric Semantic GP (GSGP) has proved to be both efficient and effective in finding good solutions. The fact that the operators of GSGP operate on the semantics of the individuals in a clear way provides guarantees on the way the search is performed. GSGP is not, however, free from limitations, such as the premature convergence of the population to a small, and possibly sub-optimal, area of the search space. One reason for this issue could be that good individuals can quickly "spread" in the population, suppressing the emergence of competition. To mitigate this problem, we impose a cellular automata (CA) inspired communication topology over GSGP. In CAs, a collection of agents (as finite state automata) is positioned in an n-dimensional periodic grid, and each agent communicates only locally with the automata in its neighbourhood. Similarly, we assign a location to each individual on an n-dimensional grid, and the entire evolution of an individual happens locally by considering only the individuals in its neighbourhood. Specifically, we present an algorithm in which, for each generation, a subset of the neighbourhood of each individual is sampled and the selection for the given cell in the grid is performed by extracting the two best individuals of this subset, which are employed as parents for the Geometric Semantic Crossover. We compare this cellular GSGP (cGSGP) approach with standard GSGP on eight regression problems, showing that it can provide better solutions than GSGP. Moreover, by analyzing convergence rates, we show that the improvement is observable regardless of the number of executed generations. As a side effect, we additionally show that combining a small-neighbourhood-based cellular spatial structure with GSGP helps in producing smaller solutions.
Finally, we measure the spatial autocorrelation of the population using Moran's I coefficient to provide an overview of the diversity, showing that our cellular spatial structure helps in providing better diversity during the early stages of the evolution.
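
The selection step described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the 2-D grid, the Moore neighbourhood that excludes the centre cell, the `radius` and `sample_size` parameters, the function names, and the convention that lower fitness is better are all assumptions made for illustration.

```python
import random


def moore_neighbourhood(row, col, rows, cols, radius=1):
    """Cells within `radius` of (row, col) on a periodic 2-D grid.

    Assumption: the centre cell itself is excluded from its neighbourhood.
    """
    cells = set()
    for dr in range(-radius, radius + 1):
        for dc in range(-radius, radius + 1):
            # Modulo arithmetic wraps around the grid edges (periodic grid).
            cells.add(((row + dr) % rows, (col + dc) % cols))
    cells.discard((row, col))
    return sorted(cells)  # sorted list so sampling is reproducible


def select_parents(fitness, row, col, sample_size, rng, radius=1):
    """Sample part of the neighbourhood and return its two best cells.

    `fitness` is a 2-D list of error values (assumed: lower is better).
    """
    rows, cols = len(fitness), len(fitness[0])
    neighbours = moore_neighbourhood(row, col, rows, cols, radius)
    sample = rng.sample(neighbours, sample_size)
    sample.sort(key=lambda cell: fitness[cell[0]][cell[1]])
    return sample[0], sample[1]


if __name__ == "__main__":
    rng = random.Random(0)
    grid = [[r * 4 + c for c in range(4)] for r in range(4)]
    print(select_parents(grid, 0, 0, sample_size=4, rng=rng))
```

The two returned cells would then supply the parents for the geometric semantic crossover that fills the cell at `(row, col)` in the next generation.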

    A multi-population hybrid Genetic Programming System

    Dissertation presented as the partial requirement for obtaining a Master's degree in Data Science and Advanced Analytics. In the last few years, geometric semantic genetic programming has grown in popularity, obtaining interesting results on several real-life applications. Nevertheless, the large size of the solutions generated by geometric semantic genetic programming is still an issue, in particular for those applications in which reading and interpreting the final solution is desirable. In this thesis, a new parallel and distributed genetic programming system is introduced with the objective of mitigating this drawback. The proposed system (called MPHGP, which stands for Multi-Population Hybrid Genetic Programming) is composed of two types of subpopulations, one of which runs geometric semantic genetic programming, while the other runs a standard multi-objective genetic programming algorithm that optimizes, at the same time, fitness and size of solutions. The two subpopulations evolve independently and in parallel, exchanging individuals at prefixed synchronization instants. The presented experimental results, obtained on five real-life symbolic regression applications, suggest that MPHGP is able to find solutions that are comparable to, or even better than, the ones found by geometric semantic genetic programming, both on training and on unseen testing data. At the same time, MPHGP is also able to find solutions that are significantly smaller than the ones found by geometric semantic genetic programming.

    Geometric Semantic Genetic Programming

    Traditional Genetic Programming (GP) searches the space of functions/programs by using search operators that manipulate their syntactic representation, regardless of their actual semantics/behaviour. Recently, semantically aware search operators have been shown to outperform purely syntactic operators. In this work, using a formal geometric view on search operators and representations, we bring the semantic approach to its extreme consequences and introduce a novel form of GP – Geometric Semantic GP (GSGP) – that searches directly the space of the underlying semantics of the programs. This perspective provides new insights on the relation between program syntax and semantics, search operators and fitness landscape, and allows for principled formal design of semantic search operators for different classes of problems. We derive specific forms of GSGP for a number of classic GP domains and experimentally demonstrate their superiority to conventional operators.

    Geometric Semantic Genetic Programming

    This thesis examines the conversion of a solution produced by geometric semantic genetic programming (GSGP) into an instance of Cartesian genetic programming (CGP). GSGP has proven effective at creating complex mathematical models; however, the resulting models can become problematically large. CGP, on the other hand, is able to reduce the size of existing models. This thesis combines the two methods to create subtree CGP (SCGP), which takes the output of GSGP as its input and performs the evolution using CGP. Experiments on four pharmacokinetic tasks showed that SCGP reduced the solution size in every case; overfitting was detected in one of the four test problems.

    Geometric semantic genetic programming for recursive boolean programs

    This is the author accepted manuscript. The final version is available from ACM via the DOI in this record. Geometric Semantic Genetic Programming (GSGP) induces a unimodal fitness landscape for any problem that consists in finding a function fitting given input/output examples. Most of the work around GSGP to date has focused on real-world applications and on improving the originally proposed search operators, rather than on broadening its theoretical framework to new domains. We extend GSGP to recursive programs, a notoriously challenging domain with highly discontinuous fitness landscapes. We focus on programs that map variable-length Boolean lists to Boolean values, and design search operators that are provably efficient in the training phase and attain perfect generalization. Computational experiments complement the theory and demonstrate the superiority of the new operators to the conventional ones. This work provides new insights into the relations between program syntax and semantics, search operators and fitness landscapes, also for more general recursive domains. © 2017 Copyright held by the owner/author(s).

    A Dispersion Operator for Geometric Semantic Genetic Programming

    Recent advances in geometric semantic genetic programming (GSGP) have shown that the results obtained by these methods can outperform those obtained by classical genetic programming algorithms, in particular in the context of symbolic regression. However, there are still many open issues on how to improve their search mechanism. One of these issues is how to get around the fact that the GSGP crossover operator cannot generate solutions that lie outside the convex hull formed by the individuals of the current population. Although the mutation operator alleviates this problem, we cannot guarantee it will find promising regions of the search space within feasible computational time. To address this, the paper proposes a new geometric dispersion operator that uses multiplicative factors to move individuals to less dense areas of the search space around the target solution before applying semantic genetic operators. Experiments on sixteen datasets show that the results obtained by the proposed operator are statistically significantly better than those produced by GSGP and that the operator does indeed spread the solutions around the target solution.
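
The convex-hull limitation mentioned in this abstract can be checked numerically with a small sketch. It works directly on semantics vectors (lists of program outputs) rather than on program trees, and it assumes the crossover variant in which offspring semantics are a convex combination of the two parents with a single random weight in [0, 1]. The function names and the example vectors are hypothetical, and the dispersion operator itself is not reproduced because the abstract does not give its details.

```python
import random


def gs_crossover(sem_a, sem_b, rng):
    """Offspring semantics as a convex combination of two parents.

    Assumption: one random constant weight r in [0, 1) per crossover.
    """
    r = rng.random()
    return [r * a + (1.0 - r) * b for a, b in zip(sem_a, sem_b)]


def coordinate_bounds(population):
    """Per-coordinate minimum and maximum over a population of vectors."""
    lows = [min(column) for column in zip(*population)]
    highs = [max(column) for column in zip(*population)]
    return lows, highs


def evolve_by_crossover_only(population, generations, rng):
    """Replace the population each generation using crossover alone."""
    for _ in range(generations):
        next_population = []
        for _ in population:
            parent_a, parent_b = rng.sample(population, 2)
            next_population.append(gs_crossover(parent_a, parent_b, rng))
        population = next_population
    return population


if __name__ == "__main__":
    rng = random.Random(1)
    start = [[0.0, 1.0], [2.0, 0.5], [1.0, 3.0]]
    final = evolve_by_crossover_only(start, generations=20, rng=rng)
    print(coordinate_bounds(start))
    print(coordinate_bounds(final))
```

Because every offspring lies in the convex hull of its parents, its coordinates stay within the initial per-coordinate bounds, so a target such as `[10.0, 10.0]` could not be reached from `start` by crossover alone. This bounding-box check is weaker than a full convex-hull test, but it is enough to show why an operator that moves individuals before crossover is of interest.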