Empirical Evaluation of Mutation-based Test Prioritization Techniques
We propose a new test case prioritization technique that combines both
mutation-based and diversity-based approaches. Our diversity-aware
mutation-based technique relies on the notion of mutant distinguishment, which
aims to distinguish one mutant's behavior from another, rather than from the
original program. We empirically investigate the relative cost and
effectiveness of the mutation-based prioritization techniques (i.e., using both
the traditional mutant kill and the proposed mutant distinguishment) with 352
real faults and 553,477 developer-written test cases. The empirical evaluation
considers both the traditional and the diversity-aware mutation criteria in
various settings: single-objective greedy, hybrid, and multi-objective
optimization. The results show that there is no single dominant technique
across all the studied faults. To this end, we show when, and why, each of the
mutation-based prioritization criteria performs poorly, using a graphical model
called the Mutant Distinguishment Graph (MDG) that visualizes the distribution
of the fault-detecting test cases with respect to mutant kills and
distinguishment.
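To make the two criteria concrete, here is a minimal sketch of the difference between the traditional kill relation (mutant vs. original program) and the distinguishment relation (mutant vs. mutant). All function and test names are illustrative, not from the paper's artifact; outcomes are modeled as boolean pass/fail results per test.

```python
# Sketch: traditional mutant kill vs. mutant distinguishment.
# An outcome map records, per test, whether the program version passes it.

def kills(outcomes_original, outcomes_mutant):
    """Tests that kill a mutant: its outcome differs from the original's."""
    return {t for t in outcomes_original
            if outcomes_original[t] != outcomes_mutant[t]}

def distinguishes(outcomes_m1, outcomes_m2):
    """Tests that distinguish mutant m1 from m2: their outcomes differ."""
    return {t for t in outcomes_m1 if outcomes_m1[t] != outcomes_m2[t]}

# Toy outcomes: True = test passes on that program version.
original = {"t1": True,  "t2": True,  "t3": True}
m1       = {"t1": False, "t2": True,  "t3": True}   # killed only by t1
m2       = {"t1": False, "t2": False, "t3": True}   # killed by t1 and t2

print(kills(original, m1))    # {'t1'}
print(distinguishes(m1, m2))  # {'t2'}: t2 separates m1 from m2,
                              # even though both are killed by t1
```

Note that a test suite can kill both mutants without ever distinguishing them; the diversity-aware criterion rewards tests (like `t2` above) that separate mutant behaviors from one another.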
Evaluating the Impact of Experimental Assumptions in Automated Fault Localization
µBert: mutation testing using pre-trained language models
Mutation testing seeds faults using a predefined set of simple syntactic transformations, aka mutation operators, that are (typically) defined based on the grammar of the targeted programming language. As a result, mutation operators often alter the program semantics in ways that lead to unnatural code (unnatural in the sense that the mutated code is unlikely to be produced by a competent programmer).
Such unnatural faults may not be convincing for developers, as they might perceive them as unrealistic or uninteresting, thereby hindering the usability of the method. Additionally, the use of unnatural mutants may have an actual impact on the guidance and assessment capabilities of mutation testing, because unnatural mutants often lead to exceptions, segmentation faults, infinite loops, and other trivial cases.
To deal with this issue, we propose forming mutants that are in some sense natural; meaning that the mutated code/statement follows the implicit rules, coding conventions and generally representativeness of the code produced by competent programmers. We define/capture this naturalness of mutants using language models trained on big code that learn (quantify) the occurrence of code tokens given their surrounding code.
We introduce µBert, a mutation testing tool that uses a pre-trained language model (CodeBERT) to generate mutants. This is done by masking a token from the expression given as input and using CodeBERT to predict it.
Sociedad Argentina de Informática e Investigación Operativa
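The mask-and-predict step can be sketched as follows. This is an illustrative outline only: `predict_masked` is a stub standing in for a real CodeBERT fill-mask query, and all names are hypothetical rather than µBert's actual API.

```python
# Sketch of µBert-style mutant generation: mask one token in an
# expression and let a masked language model propose replacements.

def mask_token(tokens, index, mask="<mask>"):
    """Replace the token at `index` with the model's mask symbol."""
    masked = list(tokens)
    masked[index] = mask
    return masked

def predict_masked(masked_tokens):
    # Stub for the fill-mask prediction; a real implementation would
    # query a pre-trained model such as CodeBERT here.
    return ["-", "*", "+"]  # candidate tokens, most likely first

def generate_mutants(tokens, index):
    """Form one mutant per predicted token that differs from the original."""
    original = tokens[index]
    masked = mask_token(tokens, index)
    return [[t if i != index else cand for i, t in enumerate(tokens)]
            for cand in predict_masked(masked) if cand != original]

mutants = generate_mutants(["a", "+", "b"], 1)
print(mutants)  # [['a', '-', 'b'], ['a', '*', 'b']]; "+" is filtered out
```

Because the model ranks candidates by likelihood in context, the surviving replacements tend to be "natural" in the sense the abstract describes: plausible code a competent programmer might write.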
The "Baggaging" of Theory-Based Evaluation
We introduce SeMu, a Dynamic Symbolic Execution technique that generates test
inputs capable of killing stubborn mutants (killable mutants that remain
undetected after a reasonable amount of testing). SeMu aims at mutant
propagation (triggering erroneous states to the program output) by
incrementally searching for divergent program behaviours between the original
and the mutant versions. We model the mutant killing problem as a symbolic
execution search within a specific area in the programs' symbolic tree. In this
framework, the search area is defined and controlled by parameters that allow
scalable and cost-effective mutant killing. We integrated SeMu into KLEE and
experimented with it on Coreutils (a benchmark frequently used in symbolic
execution studies). Our results show that our modelling plays an important role
in mutant killing. Perhaps more importantly, they also show that, within a
two-hour time limit, SeMu kills 37% of the stubborn mutants, whereas KLEE kills
none and the mutant infection strategy (a strategy suggested by previous
research) kills 17%.
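The distinction between infecting a mutant's internal state and propagating that infection to the output, which SeMu's search targets, can be illustrated with a toy example. The functions below are hypothetical and chosen only to show why infection alone does not kill a mutant.

```python
# Illustrative sketch of the mutant-killing condition SeMu targets:
# an input kills a mutant only if the behavioral divergence propagates
# to the program output, not merely to an internal variable.

def original(x):
    y = x * 2
    return max(y, 0)

def mutant(x):
    y = x * 3          # mutated statement: * 2 -> * 3
    return max(y, 0)   # negative inputs mask the infection (output is 0)

def infects(x):
    """Weak condition: state differs right after the mutated statement."""
    return x * 2 != x * 3

def kills(x):
    """Strong condition: the difference reaches the program output."""
    return original(x) != mutant(x)

print(infects(-5), kills(-5))  # True False: infected, but not killed
print(infects(4),  kills(4))   # True True:  divergence reaches the output
```

An infection-only strategy would accept `x = -5` and stop, while killing the mutant requires searching further for an input like `x = 4` whose divergent states survive the downstream `max`; this is the propagation gap SeMu's incremental search for divergent behaviors is designed to close.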
- …