2,852 research outputs found
Using Evolutionary Mutation Testing to improve the quality of test suites
Mutation testing is a method used to assess and improve the fault detection capability of a test suite by creating faulty versions, called mutants, of the system under test. Evolutionary Mutation Testing (EMT), like selective mutation or mutant sampling, was proposed to reduce the computational cost, which is a major concern when applying mutation testing. This technique implements an evolutionary algorithm to produce a reduced subset of mutants but with a high proportion of mutants that can help the tester derive new test cases (strong mutants). In this paper, we go a step further in estimating the ability of this technique to induce the generation of test cases. Instead of measuring the percentage of strong mutants within the subset of generated mutants, we compute how much the test suite is actually improved thanks to those mutants. In our experiments, we have compared the extent to which EMT and the random selection of mutants help to find missing test cases in C++ object-oriented systems. We can conclude from our results that the percentage of mutants generated with EMT is lower than with the random strategy to obtain a test suite of the same size and that the technique scales better for complex programs
Search algorithms for regression test case prioritization
Regression testing is an expensive, but important, process. Unfortunately, there may be insufficient resources to allow for the re-execution of all test cases during regression testing. In this situation, test case prioritisation techniques aim to improve the effectiveness of regression testing, by ordering the test cases so that the most beneficial are executed first. Previous work on regression test case prioritisation has focused on Greedy Algorithms. However, it is known that these algorithms may produce sub-optimal results, because they may construct results that denote only local minima within the search space. By contrast, meta-heuristic and evolutionary search algorithms aim to avoid such problems. This paper presents results from an empirical study of the application of several greedy, meta-heuristic and evolutionary search algorithms to six programs, ranging from 374 to 11,148 lines of code for 3 choices of fitness metric. The paper addresses the problems of choice of fitness metric, characterisation of landscape modality and determination of the most suitable search technique to apply. The empirical results replicate previous results concerning Greedy Algorithms. They shed light on the nature of the regression testing search space, indicating that it is multi-modal. The results also show that Genetic Algorithms perform well, although Greedy approaches are surprisingly effective, given the multi-modal nature of the landscape
Search-Based Mutant Selection for Efficient Test Suite Improvement: Evaluation and Results
Context: Search-based techniques have been applied to almost all areas in software engineering, especially to software testing, seeking to solve hard optimization problems. However, the problem of selecting mutants to improve the test suite at a lower cost has not been explored to the same extent as other problems, such as mutant selection for test suite evaluation or test data generation.
Objective: In this paper, we apply search-based mutant selection to enhance the quality of test suites efficiently. Namely, we use the technique known as Evolutionary Mutation Testing (EMT), which allows reducing the number of mutants while preserving the power to refine the test suite. Despite reported
benefits of its application, the existing empirical results were derived from a limited number of case studies, a particular set of mutation operators and a vague measure, which currently makes it difficult to determine the real performance of this technique.
Method: This paper addresses the shortcomings of previous studies, providing a new methodology to evaluate EMT on the basis of the actual improvement of the test suite achieved by using the evolutionary strategy. We make use of that methodology in new experiments with a carefully selected set of real-world C++ case studies. Results: EMT shows a good performance for most case studies and levels of demand of test suite improvement (around 45% less mutants than random selection in the best case). The results reveal that even a reduced subset of mutants selected with EMT can serve to increase confidence in the test suite, especially in programs with a large set of mutants.
Conclusions: These results support the use of search-based techniques to solve the problem of mutant selection for a more efficient test suite refinement. Additionally, we identify some aspects that could foreseeably help enhance EMT
Empirical Evaluation of Mutation-based Test Prioritization Techniques
We propose a new test case prioritization technique that combines both
mutation-based and diversity-based approaches. Our diversity-aware
mutation-based technique relies on the notion of mutant distinguishment, which
aims to distinguish one mutant's behavior from another, rather than from the
original program. We empirically investigate the relative cost and
effectiveness of the mutation-based prioritization techniques (i.e., using both
the traditional mutant kill and the proposed mutant distinguishment) with 352
real faults and 553,477 developer-written test cases. The empirical evaluation
considers both the traditional and the diversity-aware mutation criteria in
various settings: single-objective greedy, hybrid, and multi-objective
optimization. The results show that there is no single dominant technique
across all the studied faults. To this end, \rev{we we show when and the reason
why each one of the mutation-based prioritization criteria performs poorly,
using a graphical model called Mutant Distinguishment Graph (MDG) that
demonstrates the distribution of the fault detecting test cases with respect to
mutant kills and distinguishment
Automatic Repair of Real Bugs: An Experience Report on the Defects4J Dataset
Defects4J is a large, peer-reviewed, structured dataset of real-world Java
bugs. Each bug in Defects4J is provided with a test suite and at least one
failing test case that triggers the bug. In this paper, we report on an
experiment to explore the effectiveness of automatic repair on Defects4J. The
result of our experiment shows that 47 bugs of the Defects4J dataset can be
automatically repaired by state-of- the-art repair. This sets a baseline for
future research on automatic repair for Java. We have manually analyzed 84
different patches to assess their real correctness. In total, 9 real Java bugs
can be correctly fixed with test-suite based repair. This analysis shows that
test-suite based repair suffers from under-specified bugs, for which trivial
and incorrect patches still pass the test suite. With respect to practical
applicability, it takes in average 14.8 minutes to find a patch. The experiment
was done on a scientific grid, totaling 17.6 days of computation time. All
their systems and experimental results are publicly available on Github in
order to facilitate future research on automatic repair
Fuzzy Adaptive Tuning of a Particle Swarm Optimization Algorithm for Variable-Strength Combinatorial Test Suite Generation
Combinatorial interaction testing is an important software testing technique
that has seen lots of recent interest. It can reduce the number of test cases
needed by considering interactions between combinations of input parameters.
Empirical evidence shows that it effectively detects faults, in particular, for
highly configurable software systems. In real-world software testing, the input
variables may vary in how strongly they interact, variable strength
combinatorial interaction testing (VS-CIT) can exploit this for higher
effectiveness. The generation of variable strength test suites is a
non-deterministic polynomial-time (NP) hard computational problem
\cite{BestounKamalFuzzy2017}. Research has shown that stochastic
population-based algorithms such as particle swarm optimization (PSO) can be
efficient compared to alternatives for VS-CIT problems. Nevertheless, they
require detailed control for the exploitation and exploration trade-off to
avoid premature convergence (i.e. being trapped in local optima) as well as to
enhance the solution diversity. Here, we present a new variant of PSO based on
Mamdani fuzzy inference system
\cite{Camastra2015,TSAKIRIDIS2017257,KHOSRAVANIAN2016280}, to permit adaptive
selection of its global and local search operations. We detail the design of
this combined algorithm and evaluate it through experiments on multiple
synthetic and benchmark problems. We conclude that fuzzy adaptive selection of
global and local search operations is, at least, feasible as it performs only
second-best to a discrete variant of PSO, called DPSO. Concerning obtaining the
best mean test suite size, the fuzzy adaptation even outperforms DPSO
occasionally. We discuss the reasons behind this performance and outline
relevant areas of future work.Comment: 21 page
- …