EvoSuite at the SBST 2016 Tool Competition
EvoSuite is a search-based tool that automatically generates unit tests for Java code. This paper summarizes the results and experiences of EvoSuite's participation at the fourth unit testing competition at SBST 2016, where EvoSuite achieved the highest overall score.
Faster Mutation Analysis via Equivalence Modulo States
Mutation analysis has many applications, such as asserting the quality of
test suites and localizing faults. One important bottleneck of mutation
analysis is scalability. The latest work explores the possibility of reducing
the redundant execution via split-stream execution. However, split-stream
execution is only able to remove redundant execution before the first mutated
statement.
In this paper we try to also reduce some of the redundant execution after the
execution of the first mutated statement. We observe that, although many
mutated statements are not equivalent, the execution result of those mutated
statements may still be equivalent to the result of the original statement. In
other words, the statements are equivalent modulo the current state.
In this paper we propose a fast mutation analysis approach, AccMut. AccMut
automatically detects the equivalence modulo states among a statement and its
mutations, then groups the statements into equivalence classes modulo states,
and uses only one process to represent each class. In this way, we can
significantly reduce the number of split processes. Our experiments show that
our approach can further accelerate mutation analysis on top of split-stream
execution with a speedup of 2.56x on average.
Comment: Submitted to conferenc
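The idea of equivalence modulo states can be illustrated with a minimal sketch. It is not AccMut's implementation: the statement `c = a + b`, the operator mutants, and the helper `group_mutants_by_state` are hypothetical names chosen for illustration. The sketch evaluates the original statement and each mutant in one concrete state and groups those that produce the same result, so that only one process per group would need to continue.

```python
from collections import defaultdict
import operator


def group_mutants_by_state(variants, state):
    """Group statement variants whose result is identical in the given state.

    `variants` maps a label to a binary operation standing in for the
    (possibly mutated) statement `c = a <op> b`. This is a hypothetical
    helper for illustration only.
    """
    classes = defaultdict(list)
    for label, op in variants.items():
        try:
            key = ("ok", op(state["a"], state["b"]))
        except ZeroDivisionError:
            # A crashing variant forms its own class of outcomes.
            key = ("error", "ZeroDivisionError")
        classes[key].append(label)
    return dict(classes)


# The original statement and three assumed operator mutants.
variants = {
    "original(+)": operator.add,
    "mutant(-)": operator.sub,
    "mutant(*)": operator.mul,
    "mutant(//)": operator.floordiv,
}

# In the state a=2, b=2 the `*` mutant yields the same value as the original,
# so the two can share one process; the other mutants need their own.
state = {"a": 2, "b": 2}
classes = group_mutants_by_state(variants, state)
for outcome, labels in classes.items():
    print(outcome, labels)
```

In this state four variants collapse into three classes. The grouping depends on the state: with different values of `a` and `b` the same mutants may fall into different classes, which is why the equivalence is only "modulo the current state".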
Learning How to Search: Generating Exception-Triggering Tests Through Adaptive Fitness Function Selection
Search-based test generation is guided by feedback from one or more fitness functions (scoring functions that judge solution optimality). Choosing informative fitness functions is crucial to meeting the goals of a tester. Unfortunately, many goals, such as forcing the class-under-test to throw exceptions, do not have a known fitness function formulation. We propose that meeting such goals requires treating fitness function identification as a secondary optimization step. An adaptive algorithm that can vary the selection of fitness functions could adjust its selection throughout the generation process to maximize goal attainment, based on the current population of test suites. To test this hypothesis, we have implemented two reinforcement learning algorithms in the EvoSuite framework, and used these algorithms to dynamically set the fitness functions used during generation. We have evaluated our framework, EvoSuiteFIT, on a set of 386 real faults. EvoSuiteFIT discovers and retains more exception-triggering input and produces suites that detect a variety of faults missed by the other techniques. The ability to adjust fitness functions allows EvoSuiteFIT to make strategic choices that efficiently produce more effective test suites.
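Treating fitness function choice as a secondary optimization step can be sketched with an epsilon-greedy bandit. This is an assumption made for illustration: the abstract does not name the two reinforcement learning algorithms it uses, and this sketch does not use EvoSuite's API. The fitness function names, the reward probabilities, and the functions `select_fitness_function` and `run_adaptive_selection` are all hypothetical. The reward stands in for "did this round of generation discover a new exception".

```python
import random


def select_fitness_function(estimates, epsilon, rng):
    """Pick a fitness function: explore with probability epsilon, else exploit."""
    if rng.random() < epsilon:
        return rng.choice(sorted(estimates))
    return max(estimates, key=estimates.get)


def run_adaptive_selection(reward_fn, names, rounds=200, epsilon=0.1, seed=0):
    """Repeatedly choose a fitness function and update its estimated reward."""
    rng = random.Random(seed)
    estimates = {name: 0.0 for name in names}
    counts = {name: 0 for name in names}
    for _ in range(rounds):
        choice = select_fitness_function(estimates, epsilon, rng)
        reward = reward_fn(choice, rng)
        counts[choice] += 1
        # Incremental mean of the rewards observed for this choice.
        estimates[choice] += (reward - estimates[choice]) / counts[choice]
    return estimates, counts


# Assumed probabilities that one generation round guided by each fitness
# function discovers a new exception. These numbers are invented.
ASSUMED_SUCCESS_RATE = {
    "branch_coverage": 0.2,
    "exception_count": 0.6,
    "output_diversity": 0.4,
}


def simulated_reward(name, rng):
    """Return 1.0 if the simulated round found a new exception, else 0.0."""
    return 1.0 if rng.random() < ASSUMED_SUCCESS_RATE[name] else 0.0


estimates, counts = run_adaptive_selection(simulated_reward, list(ASSUMED_SUCCESS_RATE))
print(estimates)
print(counts)
```

In a real generator the reward would come from the current population of test suites rather than from a fixed probability table, and the selection would be revisited throughout the generation process.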
Fuzzing for CPS Mutation Testing
Mutation testing can help reduce the risks of releasing faulty software. For
this reason, it is a desired practice for the development of embedded software
running in safety-critical cyber-physical systems (CPS). Unfortunately,
state-of-the-art test data generation techniques for mutation testing of C and
C++ software, two typical languages for CPS software, rely on symbolic
execution, whose limitations often prevent its application (e.g., it cannot
test black-box components).
We propose a mutation testing approach that leverages fuzz testing, which has
proved effective with C and C++ software. Fuzz testing automatically generates
diverse test inputs that exercise program branches in a varied number of ways
and, therefore, exercise statements in different program states, thus
maximizing the likelihood of killing mutants, our objective.
We performed an empirical assessment of our approach with software components
used in satellite systems currently in orbit. Our empirical evaluation shows
that mutation testing based on fuzz testing kills a significantly higher
proportion of live mutants than symbolic execution (i.e., up to an additional
47 percentage points). Further, when symbolic execution cannot be applied, fuzz
testing provides significant benefits (i.e., up to 41% mutants killed). Our
study is the first one comparing fuzz testing and symbolic execution for
mutation testing; our results provide guidance towards the development of fuzz
testing tools dedicated to mutation testing.
Comment: This article is the camera-ready version for ASE 202
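The principle of killing mutants with generated inputs can be shown in a minimal sketch. It uses plain random input generation in Python, not a coverage-guided fuzzer and not the C and C++ tooling the paper evaluates. The function under test, the three mutants, and the helpers `make_variant` and `fuzz_kill_mutants` are hypothetical. A mutant counts as killed when some input makes its output differ from the original's.

```python
import random


def make_variant(combine, exceeds_upper, upper_result):
    """Build a clamp-the-combination function; mutants vary one ingredient."""
    def variant(x, y):
        total = combine(x, y)
        if total < 0:
            return 0
        if exceeds_upper(total):
            return upper_result
        return total
    return variant


# Hypothetical function under test: clamp x + y to the range [0, 100].
original = make_variant(lambda a, b: a + b, lambda t: t > 100, 100)

mutants = {
    # Arithmetic operator replaced: x + y becomes x - y.
    "sum_to_diff": make_variant(lambda a, b: a - b, lambda t: t > 100, 100),
    # Constant replaced: the upper clamp returns 99 instead of 100.
    "upper_clamp_99": make_variant(lambda a, b: a + b, lambda t: t > 100, 99),
    # Relational operator replaced: > becomes >=. This mutant is equivalent,
    # because a total of exactly 100 yields 100 either way.
    "ge_instead_of_gt": make_variant(lambda a, b: a + b, lambda t: t >= 100, 100),
}


def fuzz_kill_mutants(original, mutants, trials=500, seed=0):
    """Feed random inputs to every live mutant and record the killing input."""
    rng = random.Random(seed)
    live = dict(mutants)
    killed = {}
    for _ in range(trials):
        if not live:
            break
        x, y = rng.randint(-200, 200), rng.randint(-200, 200)
        expected = original(x, y)
        for name in list(live):
            if live[name](x, y) != expected:
                killed[name] = (x, y)
                del live[name]
    return killed, sorted(live)


killed, live = fuzz_kill_mutants(original, mutants)
print("killed:", killed)
print("still live:", live)
```

The equivalent mutant stays live no matter how many inputs are tried, which illustrates why a mutation score of 100% is not always reachable.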
Diversifying focused testing for unit testing
Software changes constantly because developers add new features or modifications. This directly affects the effectiveness of the test suite associated with that software, especially when these new modifications are in a specific area that no test case covers. This paper tackles the problem of generating a high-quality test suite that repeatedly covers a given point in a program, with the ultimate goal of exposing faults that may affect that program point. Both search-based software testing and constraint solving offer ready, but low-quality, solutions to this: ideally a maximally diverse covering test set is required, whereas search and constraint solving tend to generate test sets with biased distributions. Our approach, Diversified Focused Testing (DFT), uses a search strategy inspired by GödelTest. We artificially inject parameters into the code branching conditions and use a bi-objective search algorithm to find diverse inputs by perturbing the injected parameters, while keeping the path conditions satisfiable. Our results demonstrate that our technique, DFT, is able to cover a desired point in the code at least 90% of the time. Moreover, adding diversity improves the bug detection and the mutation killing abilities of the test suites. We show that DFT achieves better results than focused testing, symbolic execution and random testing, with a 3% to 70% improvement in mutation score and up to 100% improvement in fault detection across 105 software subjects.