2,477 research outputs found
Semantic mutation testing
This is the Pre-print version of the Article. The official published version can be obtained from the link below - Copyright @ 2011 ElsevierMutation testing is a powerful and flexible test technique. Traditional mutation testing makes a small change to the syntax of a description (usually a program) in order to create a mutant. A test suite is considered to be good if it distinguishes between the original description and all of the (functionally non-equivalent) mutants. These mutants can be seen as representing potential small slips and thus mutation testing aims to produce a test suite that is good at finding such slips. It has also been argued that a test suite that finds such small changes is likely to find larger changes. This paper describes a new approach to mutation testing, called semantic mutation testing. Rather than mutate the description, semantic mutation testing mutates the semantics of the language in which the description is written. The mutations of the semantics of the language represent possible misunderstandings of the description language and thus capture a different class of faults. Since the likely misunderstandings are highly context dependent, this context should be used to determine which semantic mutants should be produced. The approach is illustrated through examples with statecharts and C code. The paper also describes a semantic mutation testing tool for C and the results of experiments that investigated the nature of some semantic mutation operators for C
Potential Errors and Test Assessment in Software Product Line Engineering
Software product lines (SPL) are a method for the development of variant-rich
software systems. Compared to non-variable systems, testing SPLs is extensive
due to an increasingly amount of possible products. Different approaches exist
for testing SPLs, but there is less research for assessing the quality of these
tests by means of error detection capability. Such test assessment is based on
error injection into correct version of the system under test. However to our
knowledge, potential errors in SPL engineering have never been systematically
identified before. This article presents an overview over existing paradigms
for specifying software product lines and the errors that can occur during the
respective specification processes. For assessment of test quality, we leverage
mutation testing techniques to SPL engineering and implement the identified
errors as mutation operators. This allows us to run existing tests against
defective products for the purpose of test assessment. From the results, we
draw conclusions about the error-proneness of the surveyed SPL design paradigms
and how quality of SPL tests can be improved.Comment: In Proceedings MBT 2015, arXiv:1504.0192
Amortising the Cost of Mutation Based Fault Localisation using Statistical Inference
Mutation analysis can effectively capture the dependency between source code
and test results. This has been exploited by Mutation Based Fault Localisation
(MBFL) techniques. However, MBFL techniques suffer from the need to expend the
high cost of mutation analysis after the observation of failures, which may
present a challenge for its practical adoption. We introduce SIMFL (Statistical
Inference for Mutation-based Fault Localisation), an MBFL technique that allows
users to perform the mutation analysis in advance against an earlier version of
the system. SIMFL uses mutants as artificial faults and aims to learn the
failure patterns among test cases against different locations of mutations.
Once a failure is observed, SIMFL requires either almost no or very small
additional cost for analysis, depending on the used inference model. An
empirical evaluation of SIMFL using 355 faults in Defects4J shows that SIMFL
can successfully localise up to 103 faults at the top, and 152 faults within
the top five, on par with state-of-the-art alternatives. The cost of mutation
analysis can be further reduced by mutation sampling: SIMFL retains over 80% of
its localisation accuracy at the top rank when using only 10% of generated
mutants, compared to results obtained without sampling
Will My Tests Tell Me If I Break This Code?
Automated tests play an important role in software evolution because they can
rapidly detect faults introduced during changes. In practice, code-coverage
metrics are often used as criteria to evaluate the effectiveness of test suites
with focus on regression faults. However, code coverage only expresses which
portion of a system has been executed by tests, but not how effective the tests
actually are in detecting regression faults. Our goal was to evaluate the
validity of code coverage as a measure for test effectiveness. To do so, we
conducted an empirical study in which we applied an extreme mutation testing
approach to analyze the tests of open-source projects written in Java. We
assessed the ratio of pseudo-tested methods (those tested in a way such that
faults would not be detected) to all covered methods and judged their impact on
the software project. The results show that the ratio of pseudo-tested methods
is acceptable for unit tests but not for system tests (that execute large
portions of the whole system). Therefore, we conclude that the coverage metric
is only a valid effectiveness indicator for unit tests.Comment: 7 pages, 3 figure
Test Case Purification for Improving Fault Localization
Finding and fixing bugs are time-consuming activities in software
development. Spectrum-based fault localization aims to identify the faulty
position in source code based on the execution trace of test cases. Failing
test cases and their assertions form test oracles for the failing behavior of
the system under analysis. In this paper, we propose a novel concept of
spectrum driven test case purification for improving fault localization. The
goal of test case purification is to separate existing test cases into small
fractions (called purified test cases) and to enhance the test oracles to
further localize faults. Combining with an original fault localization
technique (e.g., Tarantula), test case purification results in better ranking
the program statements. Our experiments on 1800 faults in six open-source Java
programs show that test case purification can effectively improve existing
fault localization techniques
Adversarial Sample Detection for Deep Neural Network through Model Mutation Testing
Deep neural networks (DNN) have been shown to be useful in a wide range of
applications. However, they are also known to be vulnerable to adversarial
samples. By transforming a normal sample with some carefully crafted human
imperceptible perturbations, even highly accurate DNN make wrong decisions.
Multiple defense mechanisms have been proposed which aim to hinder the
generation of such adversarial samples. However, a recent work show that most
of them are ineffective. In this work, we propose an alternative approach to
detect adversarial samples at runtime. Our main observation is that adversarial
samples are much more sensitive than normal samples if we impose random
mutations on the DNN. We thus first propose a measure of `sensitivity' and show
empirically that normal samples and adversarial samples have distinguishable
sensitivity. We then integrate statistical hypothesis testing and model
mutation testing to check whether an input sample is likely to be normal or
adversarial at runtime by measuring its sensitivity. We evaluated our approach
on the MNIST and CIFAR10 datasets. The results show that our approach detects
adversarial samples generated by state-of-the-art attacking methods efficiently
and accurately.Comment: Accepted by ICSE 201
Detecting Trivial Mutant Equivalences via Compiler Optimisations
Mutation testing realises the idea of fault-based testing, i.e., using artificial defects to guide the testing process. It is used to evaluate the adequacy of test suites and to guide test case generation. It is a potentially powerful form of testing, but it is well-known that its effectiveness is inhibited by the presence of equivalent mutants. We recently studied Trivial Compiler Equivalence (TCE) as a simple, fast and readily applicable technique for identifying equivalent mutants for C programs. In the present work, we augment our findings with further results for the Java programming language. TCE can remove a large portion of all mutants because they are determined to be either equivalent or duplicates of other mutants. In particular, TCE equivalent mutants account for 7.4% and 5.7% of all C and Java mutants, while duplicated mutants account for a further 21% of all C mutants and 5.4% Java mutants, on average. With respect to a benchmark ground truth suite (of known equivalent mutants), approximately 30% (for C) and 54% (for Java) are TCE equivalent. It is unsurprising that results differ between languages, since mutation characteristics are language-dependent. In the case of Java, our new results suggest that TCE may be particularly effective, finding almost half of all equivalent mutants
- …