
    JUGE: An Infrastructure for Benchmarking Java Unit Test Generators

    Researchers and practitioners have designed and implemented various automated test case generators to support effective software testing. Such generators exist for various languages (e.g., Java, C#, or Python) and platforms (e.g., desktop, web, or mobile applications), and they exhibit varying effectiveness and efficiency depending on the testing goals they aim to satisfy (e.g., unit testing of libraries vs. system testing of entire applications) and the underlying techniques they implement. In this context, practitioners need to be able to compare different generators to identify the one best suited to their requirements, while researchers seek to identify future research directions. This can be achieved through systematic, large-scale evaluations of different generators. However, such empirical evaluations are not trivial and require substantial effort to collect benchmarks, set up the evaluation infrastructure, and collect and analyse the results. In this paper, we present our JUnit Generation benchmarking infrastructure (JUGE), which supports generators (e.g., search-based, random-based, or symbolic-execution-based) seeking to automate the production of unit tests for various purposes (e.g., validation, regression testing, or fault localization). The primary goal is to reduce the overall effort, ease the comparison of several generators, and enhance knowledge transfer between academia and industry by standardizing the evaluation and comparison process. Since 2013, eight editions of a unit testing tool competition, co-located with the Search-Based Software Testing Workshop, have taken place, each using and updating JUGE. As a result, a growing number of tools (more than ten) from both academia and industry have been evaluated on JUGE and matured over the years, and the results have helped identify future research directions.
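    The competition workflow the paper describes can be pictured as a loop over tools and benchmark classes, recording what each generator produces within a time budget. The sketch below is a minimal, hypothetical harness in Python (JUGE itself is a Java infrastructure; the `run_benchmark` API and the toy generators are invented for illustration).

```python
import time

def run_benchmark(generators, benchmark_classes, budget_seconds):
    """Run every generator on every benchmark class and record results.

    Each "generator" is a callable that, given a class under test and a
    time budget, returns a list of generated test cases (hypothetical API).
    """
    results = []
    for name, generate in generators.items():
        for cut in benchmark_classes:
            start = time.monotonic()
            tests = generate(cut, budget_seconds)
            elapsed = time.monotonic() - start
            results.append({
                "tool": name,
                "class": cut,
                "tests": len(tests),
                "time": round(elapsed, 3),
            })
    return results

# Toy generators standing in for real tools such as EvoSuite or Randoop.
generators = {
    "random-based": lambda cut, budget: [f"test_{cut}_{i}" for i in range(3)],
    "search-based": lambda cut, budget: [f"test_{cut}_{i}" for i in range(5)],
}
report = run_benchmark(generators, ["Stack", "Queue"], budget_seconds=60)
for row in report:
    print(row["tool"], row["class"], row["tests"])
```

A real infrastructure would additionally compile and execute the generated tests, measure coverage and mutation scores, and repeat runs to support statistical analysis.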

    Java Unit Testing Tool Competition — Fifth Round

    After four successful JUnit tool competitions, we report on the achievements of a new Java Unit Testing Tool Competition. This fifth edition introduces statistical analyses into the benchmark infrastructure, validated for statistical significance against the results of the previous edition. Overall, the competition evaluates four automated JUnit testing tools, taking human-written test cases from real projects as the baseline. The paper details the modifications made to the methodology and provides the full results of the competition.
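    One statistic commonly used in such statistical analyses of testing-tool comparisons is the Vargha-Delaney Â12 effect size. The paper does not spell out its exact procedure, so the following is a generic sketch with made-up coverage numbers:

```python
def a12(group_a, group_b):
    """Vargha-Delaney A12 effect size: the probability that a value drawn
    from group_a is larger than one drawn from group_b (0.5 = no difference)."""
    greater = sum(1 for x in group_a for y in group_b if x > y)
    equal = sum(1 for x in group_a for y in group_b if x == y)
    return (greater + 0.5 * equal) / (len(group_a) * len(group_b))

# Branch coverage across five runs for two hypothetical tools:
tool_x = [0.81, 0.79, 0.85, 0.80, 0.83]
tool_y = [0.70, 0.74, 0.69, 0.72, 0.71]
print(a12(tool_x, tool_y))  # 1.0: tool_x wins in every pairwise comparison
```

Because Â12 is rank-based, it is robust to the non-normal score distributions that randomised test generators typically produce, which is why it is usually paired with a non-parametric test such as Mann-Whitney U.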

    Message from the workshop chairs


    Automatic Test Data Generation Using Constraint Programming and Search Based Software Engineering Techniques

    Proving that a software system corresponds to its specification, or revealing hidden errors in its implementation, is a difficult and tedious testing task that can account for more than 50% of total software cost. Test-data generation is one of the most expensive parts of the software testing phase; automating it can therefore significantly reduce software cost, development time, and time to market. Many researchers have proposed automated approaches to generate test data. Among these, the literature has shown that Search-Based Software Test-data Generation (SB-STDG) techniques can generate test data automatically. However, these techniques are very sensitive to their guidance, which affects the whole test-data generation process: a lack of relevant information about the test-data generation problem can weaken the guidance and negatively affect the efficiency and effectiveness of SB-STDG. In this dissertation, our thesis is that statically analyzing source code to identify and extract relevant information, and exploiting that information in the SB-STDG process, can offer more guidance and thus improve the efficiency and effectiveness of SB-STDG. To extract such information, we statically analyze the internal structure of the source code, focusing on six features: constants, conditional statements, arguments, data members, methods, and relationships. 
Focusing on these features and using different existing static-analysis techniques, i.e., constraint programming (CP), schema theory, and some lightweight static analyses, we propose four approaches: (1) focusing on arguments and conditional statements, we define a hybrid approach that uses CP techniques to guide SB-STDG in reducing its search space; (2) focusing on conditional statements and using CP techniques, we define two new metrics that measure the difficulty of satisfying a branch (i.e., a condition), from which we derive two new fitness functions to guide SB-STDG; (3) focusing on conditional statements and using schema theory, we tailor a genetic algorithm to better fit the problem of test-data generation; (4) focusing on arguments, conditional statements, constants, data members, methods, and relationships, and using lightweight static analyses, we define an instance generator that produces relevant test-data candidates, together with a new representation of the object-oriented test-data generation problem that implicitly reduces the SB-STDG search space. We show that using static analyses improves the efficiency and effectiveness of SB-STDG. The results achieved in this dissertation show important improvements in both respects; they are promising, and we hope that further research in the field of test-data generation can improve efficiency and effectiveness further.
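    The branch-difficulty metrics and fitness functions mentioned above build on the classical notion of branch distance from search-based testing. As a generic illustration (not the dissertation's exact metrics), a minimal sketch:

```python
K = 1.0  # constant penalty added while the branch predicate is unsatisfied

def branch_distance(op, lhs, rhs):
    """Classical branch distance: 0.0 when the branch predicate already
    holds, otherwise a measure of how 'far' the operands are from
    satisfying it. Only a few relational operators are sketched here."""
    if op == "==":
        return 0.0 if lhs == rhs else abs(lhs - rhs) + K
    if op == "!=":
        return 0.0 if lhs != rhs else K
    if op == "<":
        return 0.0 if lhs < rhs else (lhs - rhs) + K
    if op == "<=":
        return 0.0 if lhs <= rhs else (lhs - rhs) + K
    raise ValueError(f"unsupported operator: {op}")

# Guiding the search towards taking the branch `if (x == 42)`:
print(branch_distance("==", 10, 42))  # 33.0: far from the target
print(branch_distance("==", 41, 42))  # 2.0: much closer
print(branch_distance("==", 42, 42))  # 0.0: branch satisfied
```

A gradient like this is what gives search-based generators their guidance: inputs that move the operands closer to satisfying the condition receive strictly better fitness, even before the branch is actually taken.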

    Doctor of Philosophy

    In computer science, functional software testing is a method of ensuring that software produces the expected output on specific inputs. Software testing is conducted to ensure desired levels of quality in light of the uncertainty that results from the complexity of software. Most of today's software is written by people, and software development is a creative activity. However, due to the complexity of computer systems and software development processes, this activity leads to mismatches between the expected software functionality and the implemented one. If not addressed in a timely and proper manner, such a mismatch can cause serious consequences for users of the software, such as security and privacy breaches, financial loss, and adverse effects on human health. Because of the manual effort involved, software testing is costly. Automatic software testing, i.e., testing performed without human intervention, is one way of addressing this issue. In this work, we build upon and extend several techniques for automatic software testing; the techniques require no guidance from the user. The goals achieved with these techniques are checking for yet-unknown errors, automatically testing object-oriented software, and detecting malicious software. To meet these goals, we explored several techniques and related challenges: automatic test case generation, runtime verification, dynamic symbolic execution, and the type and size of test inputs for efficient detection of malicious software via machine learning. Our work targets software written in the Java programming language, though the techniques are general and applicable to other languages. We performed an extensive evaluation on freely available Java software projects, a flight collision avoidance system, and thousands of applications for the Android operating system. 
The evaluation results show to what extent dynamic symbolic execution is applicable to testing object-oriented software, demonstrate the correctness of the flight system on millions of automatically customized and generated test cases, and show that simple and relatively small inputs in random testing can lead to effective malicious software detection.

    Automatic generation of smell-free unit tests

    Master's thesis, Engenharia Informática, 2022, Universidade de Lisboa, Faculdade de Ciências.
    Automated test generation tools (such as EvoSuite) typically aim to maximize code coverage. However, they frequently disregard non-coverage aspects that can be relevant to testers, such as the quality of the generated tests. As a result, automatically generated tests are often affected by a set of test-specific bad programming practices, i.e., test smells, that may hinder the quality of both test and production code. Given that other researchers have successfully integrated non-coverage quality metrics into EvoSuite, we extended the tool so that the generated test code is smell-free. To this end, we compiled 54 test smells from several sources and selected the 16 smells relevant to the context of this work. We then augmented the tool with the respective test smell metrics and investigated the diffusion of the selected smells and the distribution of the metrics. Finally, we implemented an approach that optimizes the test smell metrics as secondary criteria. After establishing the optimal configuration for the secondary criteria (used throughout the remainder of the study), we conducted an empirical study to assess whether the tests became significantly less smelly. Furthermore, we studied how the proposed metrics affect the fault-detection effectiveness, coverage, and size of the generated tests. Our study revealed that the proposed approach reduces the overall smelliness of the generated tests; in particular, the diffusion of the “Indirect Testing” and “Unrelated Assertions” smells improved considerably. Moreover, our approach reduced the smelliness of the tests generated by EvoSuite without compromising code coverage or fault-detection effectiveness, and the size and length of the generated tests were not affected by the new secondary criteria.

    Improving Readability in Automatic Unit Test Generation

    In object-oriented programming, quality assurance is commonly provided through writing unit tests to exercise the operations of each class. If unit tests are created and maintained manually, this can be a time-consuming and laborious task. For this reason, automatic methods are often used to generate tests that seek to cover all paths of the tested code. The search may be guided by criteria that are opaque to the programmer, resulting in test sequences that are long and confusing, which has a negative impact on test maintenance. Once tests have been created, the job is not done: programmers need to reason about the tests throughout the lifecycle, as the tested software units evolve. Maintenance includes diagnosing failing tests (whether due to a software fault or an invalid test) and preserving test oracles (ensuring that checked assertions are still relevant). Programmers also need to understand the tests created for code that they did not write themselves, in order to understand the intent of that code. If generated tests cannot be easily understood, then they will be extremely difficult to maintain. The overall objective of this thesis is to reaffirm the importance of unit test maintenance and to offer novel techniques to improve the readability of automatically generated tests. The first contribution is an empirical survey of 225 developers from different parts of the world, who were asked for their opinions about unit testing practices and problems. The survey responses confirm that unit testing is considered important and that there is an appetite for higher-quality automated test generation, with a view to test maintenance. The second contribution is a domain-specific model of unit test readability, based on human judgements. The model is used to augment automated unit test generation to produce test suites with both high coverage and improved readability. 
In evaluations, 30 programmers preferred our improved tests and were able to answer maintenance questions with a 14% higher level of accuracy. The third contribution is a novel algorithm for generating descriptive test names that summarise API-level coverage goals. Test optimisation ensures that each test is short, bears a clear relation to the covered code, and can be readily identified by programmers. In evaluations, 47 programmers agreed with the choice of synthesised names and found them as descriptive as manually chosen names. Participants were also more accurate and faster at matching generated tests against the tested code, compared to matching with manually chosen test names.
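    The idea of synthesising names from API-level coverage goals can be sketched in a few lines; the `synthesize_test_name` helper and its naming scheme below are invented for illustration and are not the thesis's actual algorithm:

```python
def synthesize_test_name(method, scenario=None, outcome=None):
    """Build a descriptive test name from the covered API-level goal:
    the method under test, an optional input scenario, and the observed
    outcome, camel-cased into a single identifier."""
    parts = ["test", method[0].upper() + method[1:]]
    if scenario:
        parts.append("With" + scenario)
    if outcome:
        parts.append(outcome)
    return "".join(parts)

print(synthesize_test_name("pop", scenario="EmptyStack",
                           outcome="ThrowsException"))
# testPopWithEmptyStackThrowsException
```

The point of such a scheme is that the name alone tells a maintainer which method is exercised and what behaviour is checked, without reading the test body.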

    Exploring means to facilitate software debugging

    In this thesis, several aspects of software debugging have been studied, from automated crash reproduction to bug report analysis and the use of contracts.

    Automated Test Case Generation as a Many-Objective Optimisation Problem with Dynamic Selection of the Targets

    Test case generation is intrinsically a multi-objective problem, since the goal is to cover multiple test targets (e.g., branches). Existing search-based approaches either consider one target at a time or aggregate all targets into a single fitness function (the whole-suite approach). Multi- and many-objective optimisation algorithms (MOAs) have never been applied to this problem, because existing algorithms do not scale to the number of coverage objectives typically found in real-world software. In addition, the final goal of MOAs is to find alternative trade-off solutions in the objective space, while in test generation the interesting solutions are only those test cases covering one or more uncovered targets. In this paper, we present DynaMOSA (Dynamic Many-Objective Sorting Algorithm), a novel many-objective solver specifically designed to address the test case generation problem in the context of coverage testing. DynaMOSA extends our previous many-objective technique MOSA (Many-Objective Sorting Algorithm) with dynamic selection of the coverage targets based on the control dependency hierarchy. This extension makes the approach more effective and efficient in the case of a limited search budget. We carried out an empirical study on 346 Java classes using three coverage criteria (statement, branch, and strong mutation coverage) to assess the performance of DynaMOSA with respect to the whole-suite approach (WS), its archive-based variant (WSA), and MOSA. The results show that DynaMOSA outperforms WSA in 28% of the classes for branch coverage (+8% more coverage on average) and in 27% of the classes for mutation coverage (+11% more killed mutants on average). It outperforms WS in 51% of the classes for statement coverage, leading to +11% more coverage on average. Moreover, DynaMOSA outperforms its predecessor MOSA for all three coverage criteria in 19% of the classes, with +8% more code coverage on average.
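    The dynamic target selection at the heart of the approach can be illustrated in a few lines: a branch only becomes a search target once the branch it is control-dependent on has been covered, so the search never wastes budget on unreachable-for-now objectives. This is a simplified sketch (the `dependencies` map and target names are made up):

```python
def select_active_targets(dependencies, covered):
    """Return the uncovered targets whose control-dependency parent is
    already covered (root targets have parent None). Only these targets
    are optimised in the current generation."""
    return {t for t, parent in dependencies.items()
            if t not in covered and (parent is None or parent in covered)}

# b2 and b3 are nested under branch b1; b4 is nested under b2;
# b1 depends only on the method entry (parent None).
dependencies = {"b1": None, "b2": "b1", "b3": "b1", "b4": "b2"}

print(sorted(select_active_targets(dependencies, covered=set())))
# ['b1']  -- only the outermost branch is optimised at first
print(sorted(select_active_targets(dependencies, covered={"b1"})))
# ['b2', 'b3']  -- covering b1 unlocks its children
```

In the full algorithm this selection is combined with preference sorting over the active targets and an archive of covering tests; the sketch captures only the "dynamic" part that distinguishes DynaMOSA from MOSA.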

    Search-based Unit Test Generation for Evolving Software

    Search-based software testing has been successfully applied to generate unit test cases for object-oriented software. Typically, in search-based test generation approaches, evolutionary search algorithms are guided by code coverage criteria, such as branch coverage, to generate tests for individual coverage objectives. Although this approach has been shown to be effective, fundamental questions remain open. In particular, which criteria should test generation use in order to produce the best test suites? Which evolutionary algorithms are more effective at generating test cases with high coverage? How can search-based unit test generation scale up to software projects consisting of large numbers of components that evolve and change frequently over time? As a result, the applicability of search-based test generation techniques in practice is still fundamentally limited. In order to answer these questions, we investigate the following improvements to search-based testing. First, we propose the simultaneous optimisation of several coverage criteria using an evolutionary algorithm, rather than optimising for individual criteria. We then perform an empirical evaluation of different evolutionary algorithms to understand the influence of each one on the test optimisation problem. We then extend coverage-based test generation with a non-functional criterion, to increase the likelihood of detecting faults and to help developers identify their locations. Finally, we propose several strategies and tools to efficiently apply search-based test generation techniques in large and evolving software projects. 
Our results show that, overall, the optimisation of several coverage criteria is efficient; there is indeed an evolutionary algorithm that clearly works better for the test generation problem than others; the extended coverage-based test generation is effective at revealing and localising faults; and our proposed strategies, specifically designed to test entire software projects in a continuous way, improve efficiency and lead to higher code coverage. Consequently, the techniques and toolset presented in this thesis, which supports all the contributions described here, bring search-based software testing one step closer to practical use by equipping software engineers with the state of the art in automated test generation.
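    The simultaneous optimisation of several coverage criteria can be sketched as an aggregated fitness over per-criterion coverage ratios. The `combined_fitness` helper and the toy criteria below are invented for illustration and are much simpler than the actual whole-suite optimisation:

```python
def combined_fitness(suite, criteria, weights=None):
    """Aggregate several coverage criteria into one fitness value to be
    minimised (0.0 means every criterion is fully satisfied)."""
    weights = weights or {name: 1.0 for name in criteria}
    total = sum(weights[name] * (1.0 - coverage(suite))
                for name, coverage in criteria.items())
    return total / sum(weights.values())

# Toy criteria: each maps a test suite to a coverage ratio in [0, 1].
# A real implementation would instrument the code under test instead.
criteria = {
    "branch":   lambda suite: min(1.0, len(suite) / 10),
    "line":     lambda suite: min(1.0, len(suite) / 5),
    "mutation": lambda suite: min(1.0, len(suite) / 20),
}
suite = ["t%d" % i for i in range(5)]
print(round(combined_fitness(suite, criteria), 3))  # 0.417
```

Optimising such a combined value lets one search run pursue all criteria at once, at the cost of trading them off against each other, which is exactly the tension the empirical evaluation above examines.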