29,921 research outputs found

    Semantic variation operators for multidimensional genetic programming

    Multidimensional genetic programming represents candidate solutions as sets of programs, and thereby provides an interesting framework for exploiting building block identification. Towards this goal, we investigate the use of machine learning as a way to bias which components of programs are promoted, and propose two semantic operators to choose where useful building blocks are placed during crossover. A forward stagewise crossover operator we propose leads to significant improvements on a set of regression problems, and produces state-of-the-art results in a large benchmark study. We discuss this architecture and others in terms of their propensity for allowing heuristic search to utilize information during the evolutionary process. Finally, we look at the collinearity and complexity of the data representations that result from these architectures, with a view towards disentangling factors of variation in application. Comment: 9 pages, 8 figures, GECCO 201

    A multi-population hybrid Genetic Programming System

    Dissertation presented as the partial requirement for obtaining a Master's degree in Data Science and Advanced Analytics. In the last few years, geometric semantic genetic programming has grown in popularity, obtaining interesting results on several real-life applications. Nevertheless, the large size of the solutions generated by geometric semantic genetic programming is still an issue, in particular for those applications in which reading and interpreting the final solution is desirable. In this thesis, a new parallel and distributed genetic programming system is introduced with the objective of mitigating this drawback. The proposed system (called MPHGP, which stands for Multi-Population Hybrid Genetic Programming) is composed of two types of subpopulations, one of which runs geometric semantic genetic programming, while the other runs a standard multi-objective genetic programming algorithm that optimizes, at the same time, fitness and size of solutions. The two subpopulations evolve independently and in parallel, exchanging individuals at prefixed synchronization instants. The presented experimental results, obtained on five real-life symbolic regression applications, suggest that MPHGP is able to find solutions that are comparable to, or even better than, the ones found by geometric semantic genetic programming, both on training and on unseen testing data. At the same time, MPHGP is also able to find solutions that are significantly smaller than the ones found by geometric semantic genetic programming.

    How Noisy Data Affects Geometric Semantic Genetic Programming

    Noise is a consequence of acquiring and pre-processing data from the environment, and shows fluctuations from different sources, e.g., from sensors, signal processing technology or even human error. As a machine learning technique, Genetic Programming (GP) is not immune to this problem, which the field has frequently addressed. Recently, Geometric Semantic Genetic Programming (GSGP), a semantic-aware branch of GP, has shown robustness and high generalization capability. Researchers believe these characteristics may be associated with a lower sensitivity to noisy data. However, there is no systematic study on this matter. This paper performs an in-depth analysis of GSGP performance in the presence of noise. Using 15 synthetic datasets where noise can be controlled, we added different ratios of noise to the data and compared the results obtained with those of a canonical GP. The results show that, as we increase the percentage of noisy instances, the generalization performance degradation is more pronounced in GSGP than GP. However, in general, GSGP is more robust to noise than GP in the presence of up to 10% of noise, and presents no statistical difference for values higher than that in the test bed. Comment: 8 pages, In proceedings of Genetic and Evolutionary Computation Conference (GECCO 2017), Berlin, German

    Sequential Symbolic Regression with Genetic Programming

    This chapter describes the Sequential Symbolic Regression (SSR) method, a new strategy for function approximation in symbolic regression. The SSR method is inspired by the sequential covering strategy from machine learning, but instead of sequentially reducing the size of the problem being solved, it sequentially transforms the original problem into potentially simpler problems. This transformation is performed according to the semantic distances between the desired and obtained outputs and a geometric semantic operator. The rationale behind SSR is that, after generating a suboptimal function f via symbolic regression, the output errors can be approximated by another function in a subsequent iteration. The method was tested on eight polynomial functions, and compared with canonical genetic programming (GP) and geometric semantic genetic programming (SGP). Results showed that SSR significantly outperforms SGP and presents no statistical difference to GP. More importantly, they show the potential of the proposed strategy: an effective way of applying geometric semantic operators to combine different (partial) solutions, avoiding the exponential growth problem arising from the use of these operators.
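
    The rationale stated in this abstract (fit a suboptimal function, then approximate its output errors in a later iteration) can be illustrated with a minimal sketch. This is not the SSR method itself: the geometric semantic operator is omitted, and a hypothetical `fit_weak_model` helper that picks a single monomial by least squares stands in for a symbolic regression run. All names and the toy target below are assumptions made for illustration.

```python
import numpy as np

def fit_weak_model(x, target):
    """Hypothetical stand-in for one symbolic regression run (not GP):
    choose the single monomial c * x**k, k in 0..3, with least squared error."""
    best = None
    for k in range(4):
        basis = x ** k
        # Closed-form least-squares coefficient for a one-term model.
        c = float(basis @ target) / float(basis @ basis)
        err = float(np.sum((target - c * basis) ** 2))
        if best is None or err < best[0]:
            best = (err, c, k)
    _, c, k = best
    return lambda z, c=c, k=k: c * z ** k

def sequential_fit(x, y, iterations=5):
    """Repeatedly fit the current output errors and sum the partial models."""
    models = []
    residual = y.copy()
    errors = [float(np.sum(residual ** 2))]  # squared error before any fit
    for _ in range(iterations):
        model = fit_weak_model(x, residual)
        models.append(model)
        residual = residual - model(x)       # the errors become the next target
        errors.append(float(np.sum(residual ** 2)))

    def predict(z):
        return sum(m(z) for m in models)

    return predict, errors

# Toy polynomial target, chosen only for illustration.
x = np.linspace(-1.0, 1.0, 50)
y = x ** 3 - 2.0 * x ** 2 + x
predict, errors = sequential_fit(x, y)
print(errors)
```

    In this simplified setting each stage can only keep or reduce the squared error, because the one-term least-squares fit is never worse than predicting zero for the residuals.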

    A multiple expression alignment framework for genetic programming

    Dissertation presented as the partial requirement for obtaining a Master's degree in Data Science and Advanced Analytics. Alignment in the error space is a recent idea to exploit semantic awareness in genetic programming. In a previous contribution, the concepts of optimally aligned and optimally coplanar individuals were introduced, and it was shown that given optimally aligned, or optimally coplanar, individuals, it is possible to construct a globally optimal solution analytically. Consequently, genetic programming methods aimed at searching for optimally aligned, or optimally coplanar, individuals were introduced. This paper critically discusses those methods, analyzes their major limitations, and introduces a new genetic programming system aimed at overcoming those limitations. The presented experimental results, conducted on five real-life symbolic regression problems, show that the proposed algorithms outperform not only the existing methods based on the concept of alignment in the error space, but also geometric semantic genetic programming and standard genetic programming.

    Repeated patterns in tree genetic programming

    We extend our analysis of repetitive patterns found in genetic programming genomes to tree-based GP. As in linear GP, repetitive patterns are present in large numbers. Size fair crossover limits bloat in automatic programming, preventing the evolution of recurring motifs. We examine these complex properties in detail, e.g. using depth vs. size Catalan binary tree shape plots, subgraph and subtree matching, information entropy, syntactic and semantic fitness correlations and diffuse introns. We relate this emergent phenomenon to considerations about building blocks in GP and how GP works.

    Genetic programming with semantic equivalence classes

    Ruberto, S., Vanneschi, L., & Castelli, M. (2019). Genetic programming with semantic equivalence classes. Swarm and Evolutionary Computation, 44(February), 453-469. DOI: 10.1016/j.swevo.2018.06.001
    In this paper, we introduce the concept of semantics-based equivalence classes for symbolic regression problems in genetic programming. The idea is implemented by means of two different genetic programming systems, in which two different definitions of equivalence are used. In both systems, whenever a solution in an equivalence class is found, it is possible to generate any other solution in that equivalence class analytically. As such, these two systems allow us to shift the objective of genetic programming: instead of finding a globally optimal solution, the objective is now to find any solution that belongs to the same equivalence class as a global optimum. Further, we propose improvements to these genetic programming systems in which, once a solution that belongs to a particular equivalence class is generated, no other solution in that class is accepted in the population during the evolution anymore. We call these improved versions filtered systems. Experimental results obtained via seven complex real-life test problems show that using equivalence classes is a promising idea and that filters are generally helpful for improving the systems' performance. Furthermore, the proposed methods produce individuals with a much smaller size with respect to geometric semantic genetic programming. Finally, we show that filters are also useful to improve the performance of a state-of-the-art method, not explicitly based on semantic equivalence classes, like linear scaling.