12 research outputs found

    Will My Tests Tell Me If I Break This Code?

    Get PDF
    Automated tests play an important role in software evolution because they can rapidly detect faults introduced during changes. In practice, code-coverage metrics are often used as criteria to evaluate the effectiveness of test suites with focus on regression faults. However, code coverage only expresses which portion of a system has been executed by tests, but not how effective the tests actually are in detecting regression faults. Our goal was to evaluate the validity of code coverage as a measure for test effectiveness. To do so, we conducted an empirical study in which we applied an extreme mutation testing approach to analyze the tests of open-source projects written in Java. We assessed the ratio of pseudo-tested methods (those tested in a way such that faults would not be detected) to all covered methods and judged their impact on the software project. The results show that the ratio of pseudo-tested methods is acceptable for unit tests but not for system tests (that execute large portions of the whole system). Therefore, we conclude that the coverage metric is only a valid effectiveness indicator for unit tests.Comment: 7 pages, 3 figure

    Code coverage of adaptive random testing

    Get PDF
    Random testing is a basic software testing technique that can be used to assess the software reliability as well as to detect software failures. Adaptive random testing has been proposed to enhance the failure-detection capability of random testing. Previous studies have shown that adaptive random testing can use fewer test cases than random testing to detect the first software failure. In this paper, we evaluate and compare the performance of adaptive random testing and random testing from another perspective, that of code coverage. As shown in various investigations, a higher code coverage not only brings a higher failure-detection capability, but also improves the effectiveness of software reliability estimation. We conduct a series of experiments based on two categories of code coverage criteria: structure-based coverage, and fault-based coverage. Adaptive random testing can achieve higher code coverage than random testing with the same number of test cases. Our experimental results imply that, in addition to having a better failure-detection capability than random testing, adaptive random testing also delivers a higher effectiveness in assessing software reliability, and a higher confidence in the reliability of the software under test even when no failure is detected

    Code coverage of adaptive random testing

    Get PDF
    Random testing is a basic software testing technique that can be used to assess the software reliability as well as to detect software failures. Adaptive random testing has been proposed to enhance the failure-detection capability of random testing. Previous studies have shown that adaptive random testing can use fewer test cases than random testing to detect the first software failure. In this paper, we evaluate and compare the performance of adaptive random testing and random testing from another perspective, that of code coverage. As shown in various investigations, a higher code coverage not only brings a higher failure-detection capability, but also improves the effectiveness of software reliability estimation. We conduct a series of experiments based on two categories of code coverage criteria: structure-based coverage, and fault-based coverage. Adaptive random testing can achieve higher code coverage than random testing with the same number of test cases. Our experimental results imply that, in addition to having a better failure-detection capability than random testing, adaptive random testing also delivers a higher effectiveness in assessing software reliability, and a higher confidence in the reliability of the software under test even when no failure is detected

    Code Coverage of Adaptive Random Testing

    Full text link

    Predicting Test Suite Effectiveness for Java Programs

    Get PDF
    The coverage of a test suite is often used as a proxy for its effectiveness. However, previous studies that investigated the influence of code coverage on test suite effectiveness have failed to reach a consensus about the nature and strength of the relationship between these test suite characteristics. Moreover, many of the studies were done with small or synthetic programs, making it unclear that their results generalize to larger programs. In addition, some of the studies did not account for the confounding influence of test suite size. We have extended these studies by evaluating the relationship between test suite size, block coverage, and effectiveness for large Java programs. Our test subjects were four Java programs from different application domains: Apache POI, HSQLDB, JFreeChart, and Joda Time. All four are actively developed open source programs; they range from 80,000 to 284,000 source lines of code. For each test subject, we generated between 5,000 and 7,000 test suites by randomly selecting test methods from the program's entire test suite. The suites ranged in size from 3 to 3,000 methods. We used the coverage tool Emma to measure the block coverage of each suite and the mutation testing tool Javalanche to evaluate the effectiveness of each suite. We found that there is a low correlation between block coverage and effectiveness when the number of tests in the suite is controlled for. This suggests that block coverage, while useful for identifying under-tested parts of a program, should not be used as a quality target because it is not a good indicator of test suite effectiveness

    Model based test suite minimization using metaheuristics

    Get PDF
    Software testing is one of the most widely used methods for quality assurance and fault detection purposes. However, it is one of the most expensive, tedious and time consuming activities in software development life cycle. Code-based and specification-based testing has been going on for almost four decades. Model-based testing (MBT) is a relatively new approach to software testing where the software models as opposed to other artifacts (i.e. source code) are used as primary source of test cases. Models are simplified representation of a software system and are cheaper to execute than the original or deployed system. The main objective of the research presented in this thesis is the development of a framework for improving the efficiency and effectiveness of test suites generated from UML models. It focuses on three activities: transformation of Activity Diagram (AD) model into Colored Petri Net (CPN) model, generation and evaluation of AD based test suite and optimization of AD based test suite. Unified Modeling Language (UML) is a de facto standard for software system analysis and design. UML models can be categorized into structural and behavioral models. AD is a behavioral type of UML model and since major revision in UML version 2.x it has a new Petri Nets like semantics. It has wide application scope including embedded, workflow and web-service systems. For this reason this thesis concentrates on AD models. Informal semantics of UML generally and AD specially is a major challenge in the development of UML based verification and validation tools. One solution to this challenge is transforming a UML model into an executable formal model. In the thesis, a three step transformation methodology is proposed for resolving ambiguities in an AD model and then transforming it into a CPN representation which is a well known formal language with extensive tool support. Test case generation is one of the most critical and labor intensive activities in testing processes. The flow oriented semantic of AD suits modeling both sequential and concurrent systems. The thesis presented a novel technique to generate test cases from AD using a stochastic algorithm. In order to determine if the generated test suite is adequate, two test suite adequacy analysis techniques based on structural coverage and mutation have been proposed. In terms of structural coverage, two separate coverage criteria are also proposed to evaluate the adequacy of the test suite from both perspectives, sequential and concurrent. Mutation analysis is a fault-based technique to determine if the test suite is adequate for detecting particular types of faults. Four categories of mutation operators are defined to seed specific faults into the mutant model. Another focus of thesis is to improve the test suite efficiency without compromising its effectiveness. One way of achieving this is identifying and removing the redundant test cases. It has been shown that the test suite minimization by removing redundant test cases is a combinatorial optimization problem. An evolutionary computation based test suite minimization technique is developed to address the test suite minimization problem and its performance is empirically compared with other well known heuristic algorithms. Additionally, statistical analysis is performed to characterize the fitness landscape of test suite minimization problems. The proposed test suite minimization solution is extended to include multi-objective minimization. As the redundancy is contextual, different criteria and their combination can significantly change the solution test suite. Therefore, the last part of the thesis describes an investigation into multi-objective test suite minimization and optimization algorithms. The proposed framework is demonstrated and evaluated using prototype tools and case study models. Empirical results have shown that the techniques developed within the framework are effective in model based test suite generation and optimizatio

    Redefining and Evaluating Coverage Criteria Based on the Testing Scope

    Get PDF
    Test coverage information can help testers in deciding when to stop testing and in augmenting their test suites when the measured coverage is not deemed sufficient. Since the notion of a test criterion was introduced in the 70’s, research on coverage testing has been very active with much effort dedicated to the definition of new, more cost-effective, coverage criteria or to the adaptation of existing ones to a different domain. All these studies share the premise that after defining the entity to be covered (e.g., branches), one cannot consider a program to be adequately tested if some of its entities have never been exercised by any input data. However, it is not the case that all entities are of interest in every context. This is particularly true for several paradigms that emerged in the last decade (e.g., component-based development, service-oriented architecture). In such cases, traditional coverage metrics might not always provide meaningful information. In this thesis we address such situation and we redefine coverage criteria so to focus on the program parts that are relevant to the testing scope. We instantiate this general notion of scope-based coverage by introducing three coverage criteria and we demonstrate how they could be applied to different testing contexts. When applied to the context of software reuse, our approach proved to be useful for supporting test case prioritization, selection and minimization. Our studies showed that for prioritization we can improve the average rate of faults detected. For test case selection and minimization, we can considerably reduce the test suite size with small to no extra impact on fault detection effectiveness. When the source code is not available, such as in the service-oriented architecture paradigm, we propose an approach that customizes coverage, measured on invocations at service interface, based on data from similar users. We applied this approach to a real world application and, in our study, we were able to predict the entities that would be of interest for a given user with high precision. Finally, we introduce the first of its kind coverage criterion for operational profile based testing that exploits program spectra obtained from usage traces. Our study showed that it is better correlated than traditional coverage with the probability that the next test input will fail, which implies that our approach can provide a better stopping rule. Promising results were also observed for test case selection. Our redefinition of coverage criteria approaches the topic of coverage testing from a completely different angle. Such a novel perspective paves the way for new avenues of research towards improving the cost-effectiveness of testing, yet all to be explored
    corecore