7,870 research outputs found

    Evolutionary improvement of programs

    Get PDF
    Most applications of genetic programming (GP) involve the creation of an entirely new function, program or expression to solve a specific problem. In this paper, we propose a new approach that applies GP to improve existing software by optimizing its non-functional properties such as execution time, memory usage, or power consumption. In general, satisfying non-functional requirements is a difficult task and often achieved in part by optimizing compilers. However, modern compilers are in general not always able to produce semantically equivalent alternatives that optimize non-functional properties, even if such alternatives are known to exist: this is usually due to the limited local nature of such optimizations. In this paper, we discuss how best to combine and extend the existing evolutionary methods of GP, multiobjective optimization, and coevolution in order to improve existing software. Given as input the implementation of a function, we attempt to evolve a semantically equivalent version, in this case optimized to reduce execution time subject to a given probability distribution of inputs. We demonstrate that our framework is able to produce non-obvious optimizations that compilers are not yet able to generate on eight example functions. We employ a coevolved population of test cases to encourage the preservation of the function's semantics. We exploit the original program both through seeding of the population in order to focus the search, and as an oracle for testing purposes. As well as discussing the issues that arise when attempting to improve software, we employ rigorous experimental method to provide interesting and practical insights to suggest how to address these issues

    A Memetic Algorithm for whole test suite generation

    Get PDF
    The generation of unit-level test cases for structural code coverage is a task well-suited to Genetic Algorithms. Method call sequences must be created that construct objects, put them into the right state and then execute uncovered code. However, the generation of primitive values, such as integers and doubles, characters that appear in strings, and arrays of primitive values, are not so straightforward. Often, small local changes are required to drive the value toward the one needed to execute some target structure. However, global searches like Genetic Algorithms tend to make larger changes that are not concentrated on any particular aspect of a test case. In this paper, we extend the Genetic Algorithm behind the EvoSuiTE test generation tool into a Memetic Algorithm, by equipping it with several local search operators. These operators are designed to efficiently optimize primitive values and other aspects of a test suite that allow the search for test cases to function more effectively. We evaluate our operators using a rigorous experimental methodology on over 12,000 Java classes, comprising open source classes of various different kinds, including numerical applications and text processors. Our study shows that increases in branch coverage of up to 53% are possible for an individual class in practice

    Optimization Techniques for Automated Software Test Data Generation

    Get PDF
    Esta tesis propone una variedad de contribuciones al campo de pruebas evolutivas. Hemos abarcados un amplio rango de aspectos relativos a las pruebas de programas: código fuente procedimental y orientado a objetos, paradigmas estructural y funcional, problemas mono-objetivo y multi-objetivo, casos de prueba aislados y secuencias de pruebas, y trabajos teóricos y experimentales. En relación a los análisis llevados a cabo, hemos puesto énfasis en el análisis estadístico de los resultados para evaluar la significancia práctica de los resultados. En resumen, las principales contribuciones de la tesis son: Definición de una nueva medida de distancia para el operador instanceof en programas orientados a objetos: En este trabajo nos hemos centrado en un aspecto relacionado con el software orientado a objetos, la herencia, para proponer algunos enfoques que pueden ayudar a guiar la búsqueda de datos de prueba en el contexto de las pruebas evolutivas. En particular, hemos propuesto una medida de distancia para computar la distancia de ramas en presencia del operador instanceof en programas Java. También hemos propuesto dos operadores de mutación que modifican las soluciones candidatas basadas en la medida de distancia definida. Definición de una nueva medida de complejidad llamada ``Branch Coverage Expectation'': En este trabajo nos enfrentamos a la complejidad de pruebas desde un punto de vista original: un programa es más complejo si es más difícil de probar de forma automática. Consecuentemente, definimos la ``Branch Coverage Expectation'' para proporcionar conocimiento sobre la dificultad de probar programas. La fundación de esta medida se basa en el modelo de Markov del programa. El modelo de Markov proporciona fundamentos teóricos. El análisis de esta medida indica que está más correlacionada con la cobertura de rama que las otras medidas de código estáticas. Esto significa que esto es un buen modo de estimar la dificultad de probar un programa. Predicción teórica del número de casos de prueba necesarios para cubrir un porcentaje concreto de un programa: Nuestro modelo de Markov del programa puede ser usado para proporcionar una estimación del número de casos de prueba necesarios para cubrir un porcentaje concreto del programa. Hemos comparado nuestra predicción teórica con la media de las ejecuciones reales de un generador de datos de prueba. Este modelo puede ayudar a predecir la evolución de la fase de pruebas, la cual consecuentemente puede ahorrar tiempo y coste del proyecto completo. Esta predicción teórica podría ser también muy útil para determinar el porcentaje del programa cubierto dados un número de casos de prueba. Propuesta de enfoques para resolver el problema de generación de datos de prueba multi-objetivo: En ese capítulo estudiamos el problema de la generación multi-objetivo con el fin de analizar el rendimiento de un enfoque directo multi-objetivo frente a la aplicación de un algoritmo mono-objetivo seguido de una selección de casos de prueba. Hemos evaluado cuatro algoritmos multi-objetivo (MOCell, NSGA-II, SPEA2, y PAES) y dos algoritmos mono-objetivo (GA y ES), y dos algoritmos aleatorios. En términos de convergencia hacía el frente de Pareto óptimo, GA y MOCell han sido los mejores resolutores en nuestra comparación. Queremos destacar que el enfoque mono-objetivo, donde se ataca cada rama por separado, es más efectivo cuando el programa tiene un grado de anidamiento alto. Comparativa de diferentes estrategias de priorización en líneas de productos y árboles de clasificación: En el contexto de pruebas funcionales hemos tratado el tema de la priorización de casos de prueba con dos representaciones diferentes, modelos de características que representan líneas de productos software y árboles de clasificación. Hemos comparado cinco enfoques relativos al método de clasificación con árboles y dos relativos a líneas de productos, cuatro de ellos propuestos por nosotros. Los resultados nos indican que las propuestas para ambas representaciones basadas en un algoritmo genético son mejores que el resto en la mayoría de escenarios experimentales, es la mejor opción cuando tenemos restricciones de tiempo o coste. Definición de la extensión del método de clasificación con árbol para la generación de secuencias de pruebas: Hemos definido formalmente esta extensión para la generación de secuencias de pruebas que puede ser útil para la industria y para la comunidad investigadora. Sus beneficios son claros ya que indudablemente el coste de situar el artefacto bajo pruebas en el siguiente estado no es necesario, a la vez que reducimos significativamente el tamaño de la secuencia utilizando técnicas metaheurísticas. Particularmente nuestra propuesta basada en colonias de hormigas es el mejor algoritmo de la comparativa, siendo el único algoritmo que alcanza la cobertura máxima para todos los modelos y tipos de cobertura. Exploración del efecto de diferentes estrategias de seeding en el cálculo de frentes de Pareto óptimos en líneas de productos: Estudiamos el comportamiento de algoritmos clásicos multi-objetivo evolutivos aplicados a las pruebas por pares de líneas de productos. El grupo de algoritmos fue seleccionado para cubrir una amplia y diversa gama de técnicas. Nuestra evaluación indica claramente que las estrategias de seeding ayudan al proceso de búsqueda de forma determinante. Cuanta más información se disponga para crear esta población inicial, mejores serán los resultados obtenidos. Además, gracias al uso de técnicas multi-objetivo podemos proporcionar un conjunto de pruebas adecuado mayor o menor, en resumen, que mejor se adapte a sus restricciones económicas o tecnológicas. Propuesta de técnica exacta para la computación del frente de Pareto óptimo en líneas de productos software: Hemos propuesto un enfoque exacto para este cálculo en el caso multi-objetivo con cobertura paiwise. Definimos un programa lineal 0-1 y un algoritmo basado en resolutores SAT para obtener el frente de Pareto verdadero. La evaluación de los resultados nos indica que, a pesar de ser un fantástico método para el cálculo de soluciones óptimas, tiene el inconveniente de la escalabilidad, ya que para modelos grandes el tiempo de ejecución sube considerablemente. Tras realizar un estudio de correlaciones, confirmamos nuestras sospechas, existe una alta correlación entre el tiempo de ejecución y el número de productos denotado por el modelo de características del programa

    Evolutionary Search Techniques with Strong Heuristics for Multi-Objective Feature Selection in Software Product Lines

    Get PDF
    Software design is a process of trading off competing objectives. If the user objective space is rich, then we should use optimizers that can fully exploit that richness. For example, this study configures software product lines (expressed as feature models) using various search-based software engineering methods. Our main result is that as we increase the number of optimization objectives, the methods in widespread use (e.g. NSGA-II, SPEA2) perform much worse than IBEA (Indicator-Based Evolutionary Algorithm). IBEA works best since it makes most use of user preference knowledge. Hence it does better on the standard measures (hypervolume and spread) but it also generates far more products with 0 violations of domain constraints. We also present significant improvements to IBEA\u27s performance by employing three strong heuristic techniques that we call PUSH, PULL, and seeding. The PUSH technique forces the evolutionary search to respect certain rules and dependencies defined by the feature models, while the PULL technique gives higher weight to constraint satisfaction as an optimization objective and thus achieves a higher percentage of fully-compliant configurations within shorter runtimes. The seeding technique helps in guiding very large feature models to correct configurations very early in the optimization process. Our conclusion is that the methods we apply in search-based software engineering need to be carefully chosen, particularly when studying complex decision spaces with many optimization objectives. Also, we conclude that search methods must be customized to fit the problem at hand. Specifically, the evolutionary search must respect domain constraints

    Automated Unit Testing of Evolving Software

    Get PDF
    As software programs evolve, developers need to ensure that new changes do not affect the originally intended functionality of the program. To increase their confidence, developers commonly write unit tests along with the program, and execute them after a change is made. However, manually writing these unit-tests is difficult and time-consuming, and as their number increases, so does the cost of executing and maintaining them. Automated test generation techniques have been proposed in the literature to assist developers in the endeavour of writing these tests. However, it remains an open question how well these tools can help with fault finding in practice, and maintaining these automatically generated tests may require extra effort compared to human written ones. This thesis evaluates the effectiveness of a number of existing automatic unit test generation techniques at detecting real faults, and explores how these techniques can be improved. In particular, we present a novel multi-objective search-based approach for generating tests that reveal changes across two versions of a program. We then investigate whether these tests can be used such that no maintenance effort is necessary. Our results show that overall, state-of-the-art test generation tools can indeed be effective at detecting real faults: collectively, the tools revealed more than half of the bugs we studied. We also show that our proposed alternative technique that is better suited to the problem of revealing changes, can detect more faults, and does so more frequently. However, we also find that for a majority of object-oriented programs, even a random search can achieve good results. Finally, we show that such change-revealing tests can be generated on demand in practice, without requiring them to be maintained over time

    Sapienz: Multi-objective automated testing for android applications

    Get PDF
    We introduce Sapienz, an approach to Android testing that uses multi-objective search-based testing to automatically explore and optimise test sequences, minimising length, while simultaneously maximising coverage and fault revelation. Sapienz combines random fuzzing, systematic and search-based exploration, exploiting seeding and multi-level instrumentation. Sapienz significantly outperforms (with large effect size) both the state-of-the-art technique Dynodroid and the widely-used tool, Android Monkey, in 7/10 experiments for coverage, 7/10 for fault detection and 10/10 for fault-revealing sequence length. When applied to the top 1, 000 Google Play apps, Sapienz found 558 unique, previously unknown crashes. So far we have managed to make contact with the developers of 27 crashing apps. Of these, 14 have confirmed that the crashes are caused by real faults. Of those 14, six already have developer-confirmed fixes

    A Methodological Framework for Evaluating Software Testing Techniques and Tools

    Get PDF
    © 2012 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.There exists a real need in industry to have guidelines on what testing techniques use for different testing objectives, and how usable (effective, efficient, satisfactory) these techniques are. Up to date, these guidelines do not exist. Such guidelines could be obtained by doing secondary studies on a body of evidence consisting of case studies evaluating and comparing testing techniques and tools. However, such a body of evidence is also lacking. In this paper, we will make a first step towards creating such body of evidence by defining a general methodological evaluation framework that can simplify the design of case studies for comparing software testing tools, and make the results more precise, reliable, and easy to compare. Using this framework, (1) software testing practitioners can more easily define case studies through an instantiation of the framework, (2) results can be better compared since they are all executed according to a similar design, (3) the gap in existing work on methodological evaluation frameworks will be narrowed, and (4) a body of evidence will be initiated. By means of validating the framework, we will present successful applications of this methodological framework to various case studies for evaluating testing tools in an industrial environment with real objects and real subjects.This work was funded by the European project FITTEST (ICT257574, 2010-2013) and Spanish National project CaSA-Calidad (TIN2010-12312-E, Ministerio de Ciencia e Innovación)Vos, TE.; Marín, B.; Escalona, MJ.; Marchetto, A. (2012). A Methodological Framework for Evaluating Software Testing Techniques and Tools. IEEE. https://doi.org/10.1109/QSIC.2012.16

    The Seed is Strong: Seeding Strategies in Search-Based Software Testing

    Full text link
    Abstract—Search-based techniques have been shown useful for the task of generating tests, for example in the case of object-oriented software. But, as for any meta-heuristic search, the efficiency is heavily dependent on many different factors; seeding is one such factor that may strongly influence this efficiency. In this paper, we evaluate new and typical strategies to seed the initial population as well as to seed values introduced during the search when generating tests for object-oriented code. We report the results of a large empirical analysis carried out on 20 Java projects (for a total of 1,752 public classes). Our experiments show with strong statistical confidence that, even for a testing tool that is already able to achieve high coverage, the use of appropriate seeding strategies can further improve performance. Keywords-test case generation; search-based testing; testing classes; search-based software engineering I

    Multiagent Learning Through Indirect Encoding

    Get PDF
    Designing a system of multiple, heterogeneous agents that cooperate to achieve a common goal is a difficult task, but it is also a common real-world problem. Multiagent learning addresses this problem by training the team to cooperate through a learning algorithm. However, most traditional approaches treat multiagent learning as a combination of multiple single-agent learning problems. This perspective leads to many inefficiencies in learning such as the problem of reinvention, whereby fundamental skills and policies that all agents should possess must be rediscovered independently for each team member. For example, in soccer, all the players know how to pass and kick the ball, but a traditional algorithm has no way to share such vital information because it has no way to relate the policies of agents to each other. In this dissertation a new approach to multiagent learning that seeks to address these issues is presented. This approach, called multiagent HyperNEAT, represents teams as a pattern of policies rather than individual agents. The main idea is that an agent’s location within a canonical team layout (such as a soccer team at the start of a game) tends to dictate its role within that team, called the policy geometry. For example, as soccer positions move from goal to center they become more offensive and less defensive, a concept that is compactly represented as a pattern. iii The first major contribution of this dissertation is a new method for evolving neural network controllers called HyperNEAT, which forms the foundation of the second contribution and primary focus of this work, multiagent HyperNEAT. Multiagent learning in this dissertation is investigated in predator-prey, room-clearing, and patrol domains, providing a real-world context for the approach. Interestingly, because the teams in multiagent HyperNEAT are represented as patterns they can scale up to an infinite number of multiagent policies that can be sampled from the policy geometry as needed. Thus the third contribution is a method for teams trained with multiagent HyperNEAT to dynamically scale their size without further learning. Fourth, the capabilities to both learn and scale in multiagent HyperNEAT are compared to the traditional multiagent SARSA(λ) approach in a comprehensive study. The fifth contribution is a method for efficiently learning and encoding multiple policies for each agent on a team to facilitate learning in multi-task domains. Finally, because there is significant interest in practical applications of multiagent learning, multiagent HyperNEAT is tested in a real-world military patrolling application with actual Khepera III robots. The ultimate goal is to provide a new perspective on multiagent learning and to demonstrate the practical benefits of training heterogeneous, scalable multiagent teams through generative encoding
    corecore