    Computational complexity analysis of genetic programming

    Genetic programming (GP) is an evolutionary computation technique to solve problems in an automated, domain-independent way. Rather than identifying the optimum of a function as in more traditional evolutionary optimization, the aim of GP is to evolve computer programs with a given functionality. While many GP applications have produced human-competitive results, the theoretical understanding of what problem characteristics and algorithm properties allow GP to be effective is comparatively limited. Compared with traditional evolutionary algorithms for function optimization, GP applications are further complicated by two additional factors: the variable-length representation of candidate programs, and the difficulty of evaluating their quality efficiently. Such difficulties considerably impact the runtime analysis of GP, where space complexity also comes into play. As a result, initial complexity analyses of GP have focused on restricted settings such as the evolution of trees with given structures or the estimation of solution quality using only a small polynomial number of input/output examples. However, the first computational complexity analyses of GP for evolving proper functions with defined input/output behavior have recently appeared. In this chapter, we present an overview of the state of the art

    PAC learning and genetic programming

    Genetic programming (GP) is a very successful type of learning algorithm that is hard to understand from a theoretical point of view. With this paper we contribute to the computational complexity analysis of genetic programming that has been started recently. We analyze GP in the well-known PAC learning framework and point out how it can observe quality changes in the evolution of functions by random sampling. This leads to computational complexity bounds for a linear GP algorithm for perfectly learning any member of a simple class of linear pseudo-Boolean functions. Furthermore, we show that the same algorithm on the functions from the same class finds good approximations of the target function in less time.
    Timo Kötzing, Frank Neumann and Reto Spöhel
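
    The idea of observing quality by random sampling can be illustrated with a small sketch. The abstract does not spell out the sampling scheme, so everything below is an assumption made for illustration: inputs are drawn uniformly at random, a function is represented by its weight vector, and the hypothetical helpers `linear_value` and `estimated_error` measure the mean absolute deviation between a candidate and the target.

```python
import random

def linear_value(weights, bits):
    """Value of a linear pseudo-Boolean function: sum of weights at set bits."""
    return sum(w for w, b in zip(weights, bits) if b)

def estimated_error(target, candidate, samples, rng):
    """Estimate the mean absolute deviation between two linear functions
    from `samples` uniformly random bit strings (assumed distribution)."""
    n = len(target)
    total = 0.0
    for _ in range(samples):
        bits = [rng.random() < 0.5 for _ in range(n)]
        total += abs(linear_value(target, bits) - linear_value(candidate, bits))
    return total / samples

if __name__ == "__main__":
    rng = random.Random(0)
    target = [1, 2, 3]
    print(estimated_error(target, [1, 2, 3], 200, rng))  # identical functions
    print(estimated_error(target, [1, 2, 0], 200, rng))  # one weight missing
```

    A candidate that matches the target has estimated error zero on every sample, while a candidate with a missing weight is penalized only on samples where the corresponding bit is set, which is why the estimate needs enough samples to see the difference.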

    Single- and multi-objective genetic programming: new bounds for weighted order and majority

    We consolidate the existing computational complexity analysis of genetic programming (GP) by bringing together sound theoretical proofs and empirical analysis. In particular, we address computational complexity issues arising when coupling algorithms using variable-length representation, such as GP itself, with different bloat-control techniques. In order to accomplish this, we first introduce several novel upper bounds for two single- and multi-objective GP algorithms on the generalised Weighted ORDER and MAJORITY problems. To obtain these, we employ well-established computational complexity analysis techniques such as fitness-based partitions and, for the first time, additive and multiplicative drift. The bounds we identify depend on two measures, the maximum tree size and the maximum population size, that arise during the optimization run and that have a key relevance in determining the runtime of the studied GP algorithms. In order to understand the impact of these measures on a typical run, we study their magnitude experimentally, and we discuss the obtained findings.
    Anh Nguyen, Tommaso Urli, Markus Wagner
    http://www.sigevo.org/foga-2013

    Examining applying high performance genetic data feature selection and classification algorithms for colon cancer diagnosis

    Background and Objectives: This paper examines the accuracy and efficiency (time complexity) of high performance genetic data feature selection and classification algorithms for colon cancer diagnosis. The need for this research derives from the urgent and increasing need for accurate and efficient algorithms. Colon cancer is a leading cause of death worldwide, hence it is vitally important for the cancer tissues to be expertly identified and classified in a rapid and timely manner, both to assure fast detection of the disease and to expedite the drug discovery process. Methods: In this research, a three-phase approach was proposed and implemented: Phases One and Two examined the feature selection algorithms and classification algorithms employed separately, and Phase Three examined the performance of the combination of these. Results: It was found from Phase One that the Particle Swarm Optimization (PSO) algorithm performed best with the colon dataset as a feature selection method (29 genes selected) and from Phase Two that the Support Vector Machine (SVM) algorithm outperformed the other classifiers, with an accuracy of almost 86%. It was also found from Phase Three that the combined use of PSO and SVM surpassed other algorithms in accuracy and performance, and was faster in terms of time analysis (94%). Conclusions: It is concluded that applying feature selection algorithms prior to classification algorithms results in better accuracy than when the latter are applied alone. This conclusion is important and significant to industry and society

    Computational Complexity Analysis of Genetic Programming - Initial Results and Future Directions

    The computational complexity analysis of evolutionary algorithms working on binary strings has significantly increased the rigorous understanding of how these types of algorithm work. Similar results on the computational complexity of genetic programming would fill an important theoretical gap. They would significantly increase the theoretical understanding of how and why genetic programming algorithms work and indicate, in a rigorous manner, how design choices of algorithm components impact its success. We summarize initial computational complexity results for simple tree-based genetic programming and point out directions for future research.
    Frank Neumann, Una-May O’Reilly and Markus Wagner
    Genetic and Evolutionary Computation Series

    Experimental supplements to the computational complexity analysis of genetic programming for problems modelling isolated program semantics

    In this paper, we carry out experimental investigations that complement recent theoretical investigations on the runtime of simple genetic programming algorithms [3, 7]. Crucial measures in these theoretical analyses are the maximum tree size that is attained during the run of the algorithms as well as the population size when dealing with multi-objective models. We study those measures in detail by experimental investigations and analyze the runtime of the different algorithms experimentally.
    Tommaso Urli, Markus Wagner and Frank Neumann

    Computational complexity analysis of simple genetic programming on two problems modeling isolated program semantics

    Analyzing the computational complexity of evolutionary algorithms (EAs) for binary search spaces has significantly informed our understanding of EAs in general. With this paper, we start the computational complexity analysis of genetic programming (GP). We set up several simplified GP algorithms and analyze them on two separable model problems, ORDER and MAJORITY, each of which captures a relevant facet of typical GP problems. Both analyses give first rigorous insights into aspects of GP design, highlighting in particular the impact of accepting or rejecting neutral moves and the importance of a local mutation operator.
    Greg Durrett, Frank Neumann, Una-May O’Reilly
    http://www.sigevo.org/foga-2011
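
    The abstract names ORDER and MAJORITY without defining them. The sketch below follows the definitions as they are commonly stated in this line of work, which is an assumption here rather than something the abstract confirms: a program is a tree whose in-order traversal yields a sequence of literals x_i or ¬x_i, ORDER counts the variables whose positive literal appears first, and MAJORITY counts the variables whose positive literal appears at least once and at least as often as the negated one. The integer encoding (+i for x_i, -i for ¬x_i) and the function names are illustrative choices.

```python
from collections import Counter

def order_fitness(leaves):
    """ORDER: count variables whose first occurrence in the leaf
    sequence is the positive literal (+i) rather than the negation (-i)."""
    seen = set()
    expressed = 0
    for literal in leaves:
        var = abs(literal)
        if var not in seen:
            seen.add(var)
            if literal > 0:
                expressed += 1
    return expressed

def majority_fitness(leaves):
    """MAJORITY: count variables whose positive literal occurs at least
    once and at least as often as its negation."""
    counts = Counter(leaves)
    variables = {abs(literal) for literal in leaves}
    return sum(1 for v in variables
               if counts[v] >= 1 and counts[v] >= counts[-v])

if __name__ == "__main__":
    # Leaf sequence: x1, not-x1, not-x2, x2, x3, not-x3, not-x3, not-x4
    leaves = [1, -1, -2, 2, 3, -3, -3, -4]
    print(order_fitness(leaves), majority_fitness(leaves))
```

    On the example sequence the two problems disagree about which variables count: ORDER rewards x1 and x3 because their positive literals come first, whereas MAJORITY rewards x1 and x2 because their positive literals are not outnumbered.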

    Computational complexity analysis of multi-objective genetic programming

    The computational complexity analysis of genetic programming (GP) has been started recently in [7] by analyzing simple (1+1) GP algorithms for the problems ORDER and MAJORITY. In this paper, we study how taking the complexity as an additional criterion influences the runtime behavior. We consider generalizations of ORDER and MAJORITY and present a computational complexity analysis of (1+1) GP using multi-criteria fitness functions that take into account the original objective and the complexity of a syntax tree as a secondary measure. Furthermore, we study the expected time until population-based multi-objective genetic programming algorithms have computed the Pareto front when taking the complexity of a syntax tree as an equally important objective.
    Frank Neumann
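
    The two ways of using complexity described above can be contrasted in a short sketch. It assumes, purely for illustration, that each solution is summarized as a pair (fitness, complexity) with fitness maximized and complexity (for example, tree size) minimized; the function names and the rule of rejecting candidates that are weakly dominated are simplifications, not the paper's exact algorithms.

```python
def lexicographic_accept(parent, child):
    """Complexity as a secondary measure: accept the child if it has
    higher fitness, or equal fitness and no greater complexity."""
    parent_fit, parent_size = parent
    child_fit, child_size = child
    return child_fit > parent_fit or (
        child_fit == parent_fit and child_size <= parent_size)

def weakly_dominates(a, b):
    """a is at least as good as b in both objectives."""
    return a[0] >= b[0] and a[1] <= b[1]

def update_front(front, candidate):
    """Complexity as an equally important objective: keep only
    mutually non-dominated (fitness, complexity) pairs."""
    if any(weakly_dominates(p, candidate) for p in front):
        return front
    return [p for p in front if not weakly_dominates(candidate, p)] + [candidate]

if __name__ == "__main__":
    front = []
    for point in [(2, 5), (2, 3), (3, 7), (1, 4)]:
        front = update_front(front, point)
    print(front)
```

    Under the lexicographic rule a single solution is kept and complexity only breaks ties, while the population-based variant retains one solution per trade-off between fitness and complexity.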