76 research outputs found

    It is not the length that matters, it is how you control it

    The length of test cases is a little-investigated topic in search-based test generation for object-oriented software, where test cases are sequences of method calls. While intuitively longer tests can achieve higher overall code coverage, there is always the threat of bloat, a complex phenomenon in evolutionary computation in which length grows abnormally over time. In this paper, we show that bloat also occurs in test generation for object-oriented software. We present different techniques to overcome the problem of length bloat, and evaluate all possible combinations of these techniques using different search lengths. Experiments on a set of difficult search targets selected from several open source and industrial projects show that the important choice in search-based testing is not the length of test cases, but how to make sure that this length does not become bloated.

    Neutral Networks of Real-World Programs and their Application to Automated Software Evolution

    The existing software development ecosystem is the product of evolutionary forces, and consequently real-world software is amenable to improvement through automated evolutionary techniques. This dissertation presents empirical evidence that software is inherently robust to small randomized program transformations, or 'mutations'. Simple and general mutation operations are demonstrated that can be applied to software source code, compiled assembler code, or directly to binary executables. These mutations often generate variants of working programs that differ significantly from the original, yet remain fully functional. Applying successive mutations to the same software program uncovers large 'neutral networks' of fully functional variants of real-world software projects. These properties of 'mutational robustness' and the corresponding 'neutral networks' have been studied extensively in biology and are believed to be related to the capacity for unsupervised evolution and adaptation. As in biological systems, mutational robustness and neutral networks in software systems enable automated evolution. The dissertation presents several applications that leverage software neutral networks to automate common software development and maintenance tasks. Neutral networks are explored to generate diverse implementations of software for improving runtime security and for proactively repairing latent bugs. Next, a technique is introduced for automatically repairing bugs in the assembler and executables compiled from off-the-shelf software. As a demonstration, a proprietary executable is manipulated to patch security vulnerabilities without access to source code or any aid from the software vendor. Finally, software neutral networks are leveraged to optimize complex nonfunctional runtime properties.
This optimization technique is used to reduce the energy consumption of the popular PARSEC benchmark applications by 20% compared with the best available public domain compiler optimizations. The applications presented herein apply evolutionary computation techniques to existing software using common software engineering tools. By enabling evolutionary techniques within the existing software development toolchain, this work is more likely to be of practical benefit to the developers and maintainers of real-world software systems.

    Automated development of clinical prediction models using genetic programming

    Genetic programming is an evolutionary computing technique, inspired by biological evolution, capable of discovering complex non-linear patterns in large datasets. Genetic programming is a general methodology, the specific implementation of which requires development of several distinct elements such as problem representation, fitness, selection and genetic variation. Despite the potential advantages of genetic programming over standard statistical methods, its applications to survival analysis are at best rare, primarily because of the difficulty in handling censored data. The aim of this work was to develop a genetic programming approach for survival analysis and demonstrate its utility for the automatic development of clinical prediction models using cardiovascular disease as a case study. We developed a tree-based untyped steady-state genetic programming approach for censored longitudinal data, comparing its performance to the de facto statistical method, Cox regression, in the development of clinical prediction models for the prediction of future cardiovascular events in patients with symptomatic and asymptomatic cardiovascular disease, using large observational datasets. We also used genetic programming to examine the prognostic significance of different risk factors together with their non-linear combinations for the prognosis of health outcomes in cardiovascular disease. These experiments showed that Cox regression and the developed steady-state genetic programming approach produced similar results when evaluated in common validation datasets. Despite slight relative differences, both approaches demonstrated an acceptable level of discrimination and calibration at a range of time points.
Whilst the application of genetic programming did not provide more accurate representations of factors that predict the risk of both symptomatic and asymptomatic cardiovascular disease when compared with existing methods, genetic programming did offer comparable performance. Despite generally comparable performance, albeit in slight favour of the Cox model, the predictors selected for representing their relationships with the outcome were quite different and, on average, the models developed using genetic programming used considerably fewer predictors. The results of the genetic programming confirm the prognostic significance of a small number of the most highly associated predictors in the Cox modelling: age, previous atherosclerosis, and albumin for secondary prevention; age, recorded diagnosis of ‘other’ cardiovascular disease, and ethnicity for primary prevention in patients with type 2 diabetes. When considered as a whole, genetic programming did not produce better performing clinical prediction models; rather, it utilised fewer predictors, most of which were the predictors that Cox regression estimated to be most strongly associated with the outcome, whilst achieving comparable performance. This suggests that genetic programming may better represent the potentially non-linear relationship of (a smaller subset of) the strongest predictors. To our knowledge, this work is the first study to develop a genetic programming approach for censored longitudinal data and assess its value for clinical prediction in comparison with the well-known and widely applied Cox regression technique. Using empirical data, this work has demonstrated that clinical prediction models developed by steady-state genetic programming have predictive ability comparable to those developed using Cox regression.
The genetic programming models were more complex and thus more difficult for domain experts to validate; however, these models were developed in an automated fashion, using fewer input variables, without the need for the domain-specific knowledge and expertise required to appropriately perform survival analysis. This work has demonstrated the strong potential of genetic programming as a methodology for automated development of clinical prediction models for diagnostic and prognostic purposes in the presence of censored data. This work compared untuned genetic programming models that were developed in an automated fashion with highly tuned Cox regression models that were developed in a very involved manner that required a certain amount of clinical and statistical expertise. Whilst the highly tuned Cox regression models performed slightly better in validation data, the performance of the automatically generated genetic programming models was generally comparable. The comparable performance demonstrates the utility of genetic programming for clinical prediction modelling and prognostic research, where the primary goal is accurate prediction. In aetiological research, where the primary goal is to examine the relative strength of association between risk factors and the outcome, Cox regression and its variants remain the de facto approach.

    Genetic programming for the RoboCup Rescue Simulation System

    The RoboCup Rescue Simulation System (RCRSS) is a dynamic system of multi-agent interaction, simulating a large-scale urban disaster scenario. Teams of rescue agents are charged with the tasks of minimizing civilian casualties and infrastructure damage while competing against limitations on time, communication, and awareness. This thesis provides the first known attempt at applying Genetic Programming (GP) to the development of behaviours necessary to perform well in the RCRSS. Specifically, this thesis studies the suitability of GP to evolve the operational behaviours required of each type of rescue agent in the RCRSS. The system developed is evaluated in terms of the consistency with which it converges on expected solutions, as well as by comparison to previous competition results. The results indicate that GP is capable of converging to some forms of expected behaviour, but that additional evolution in strategizing behaviours must be performed in order to become competitive. An enhancement to the standard GP algorithm is proposed which is shown to simplify the initial search space, allowing evolution to occur much more quickly. In addition, two forms of population are employed and compared in terms of their apparent effects on the evolution of control structures for intelligent rescue agents. The first is a single population in which each individual comprises three distinct trees for the respective control of three types of agents; the second is a set of three co-evolving subpopulations, one for each type of agent. Multiple populations of cooperating individuals appear to achieve higher proficiencies in training, but testing on unseen instances raises the issue of overfitting.

    Testing market imperfections via genetic programming

    The thesis tests the validity of the efficient markets hypothesis, focusing on stock markets. Technical trading rules are generated from training samples using an evolutionary optimization algorithm (Genetic Programming). The trading rules are subsequently applied to data samples unknown to the algorithm beforehand. The benchmark is a classic buy-and-hold strategy in the DAX and the Hang Seng. The trading rules generated by Genetic Programming generally fail to beat the benchmark consistently on a risk-adjusted basis, indicating that market efficiency holds for the DAX and the Hang Seng.