14,856 research outputs found
Is the Stack Distance Between Test Case and Method Correlated With Test Effectiveness?
Mutation testing is a means to assess the effectiveness of a test suite and
its outcome is considered more meaningful than code coverage metrics. However,
despite several optimizations, mutation testing requires a significant
computational effort and has not been widely adopted in industry. Therefore, we
study in this paper whether test effectiveness can be approximated using a more
light-weight approach. We hypothesize that a test case is more likely to detect
faults in methods that are close to the test case on the call stack than in
methods that the test case accesses indirectly through many other methods.
Based on this hypothesis, we propose the minimal stack distance between test
case and method as a new test measure, which expresses how close any test case
comes to a given method, and study its correlation with test effectiveness. We
conducted an empirical study with 21 open-source projects, which comprise in
total 1.8 million LOC, and show that a correlation exists between stack
distance and test effectiveness. The correlation reaches a strength up to 0.58.
We further show that a classifier using the minimal stack distance along with
additional easily computable measures can predict the mutation testing result
of a method with 92.9% precision and 93.4% recall. Hence, such a classifier can
be taken into consideration as a light-weight alternative to mutation testing
or as a preceding, less costly step to that.Comment: EASE 201
LittleDarwin: a Feature-Rich and Extensible Mutation Testing Framework for Large and Complex Java Systems
Mutation testing is a well-studied method for increasing the quality of a
test suite. We designed LittleDarwin as a mutation testing framework able to
cope with large and complex Java software systems, while still being easily
extensible with new experimental components. LittleDarwin addresses two
existing problems in the domain of mutation testing: having a tool able to work
within an industrial setting, and yet, be open to extension for cutting edge
techniques provided by academia. LittleDarwin already offers higher-order
mutation, null type mutants, mutant sampling, manual mutation, and mutant
subsumption analysis. There is no tool today available with all these features
that is able to work with typical industrial software systems.Comment: Pre-proceedings of the 7th IPM International Conference on
Fundamentals of Software Engineerin
A Model to Estimate First-Order Mutation Coverage from Higher-Order Mutation Coverage
The test suite is essential for fault detection during software development.
First-order mutation coverage is an accurate metric to quantify the quality of
the test suite. However, it is computationally expensive. Hence, the adoption
of this metric is limited. In this study, we address this issue by proposing a
realistic model able to estimate first-order mutation coverage using only
higher-order mutation coverage. Our study shows how the estimation evolves
along with the order of mutation. We validate the model with an empirical study
based on 17 open-source projects.Comment: 2016 IEEE International Conference on Software Quality, Reliability,
and Security. 9 page
Deciphering the Mechanisms of Developmental Disorders (DMDD): a new programme for phenotyping embryonic lethal mice
International efforts to test gene function in the mouse by the systematic knockout of each gene are creating many lines in which embryonic development is compromised. These homozygous lethal mutants represent a potential treasure trove for the biomedical community. Developmental biologists could exploit them in their studies of tissue differentiation and organogenesis; for clinical researchers they offer a powerful resource for investigating the origins of developmental diseases that affect newborns. Here, we outline a new programme of research in the UK aiming to kick-start research with embryonic lethal mouse lines. The 'Deciphering the Mechanisms of Developmental Disorders' (DMDD) programme has the ambitious goal of identifying all embryonic lethal knockout lines made in the UK over the next 5 years, and will use a combination of comprehensive imaging and transcriptomics to identify abnormalities in embryo structure and development. All data will be made freely available, enabling individual researchers to identify lines relevant to their research. The DMDD programme will coordinate its work with similar international efforts through the umbrella of the International Mouse Phenotyping Consortium [see accompanying Special Article (Adams et al., 2013)] and, together, these programmes will provide a novel database for embryonic development, linking gene identity with molecular profiles and morphology phenotypes
Recommended from our members
The Effectiveness of <i>t</i>-Way Test Data Generation
Modern society is increasingly dependent on the correct functioning of software and increasingly so in areas that are considered safety related or safety critical. Therefore, there is an increasing need to be able to verify and validate that the software is in fact correct and will perform its intended function. Many approaches to this problem have been proposed; however, none seems likely to supplant the role of testing in the near future.
If we accept that there is, and will be, a continuing need to be able to test software then the question becomes one of how can this be done effectively, both in terms of ability to detect errors and in terms of cost. One avenue of research that offers prospects of improving both of these aspects is the automatic generation of test data.
There has recently been a large amount of work conducted in this area. One particularly promising direction has been the application of ideas from the field of experimental design and in particular, the field of t-way adequate factorial designs.
The area however, is not without issues; there is evidence that the technique is capable of detecting errors but that evidence is not unequivocal. Moreover, as with almost all work in the area of automatic test generation, there has been very little comparative work comparing the technique with other test data generation techniques. Worse, there has been effectively no work done that compares any automatic test data generation technique with the effectiveness of tests generated by humans. Another major issue with the technique is the number of tests that applying the technique can result in. This implies that there is a need for an automated oracle if the technique is to be successfully applied. The flaw with this is of course that in most situations the oracle is the human that is conducting the tests, a point often ignored in testing research.
The work presented here addresses both of these points. To do this I have used a code base taken from an industrial engine control system that has an existing set of high quality unit tests developed by hand. To complement this, several other techniques for automatically generating test data have been applied, namely random testing, random experimental designs and a technique for generating single factor experiments. To address the issue of being able to compare the error detection ability of all of the sets of test vectors, rather than the usual effectiveness surrogates of code coverage I have used mutation analysis on the code base to directly measure the ability of each set of test vectors to discover common coding errors. The results presented here show that test data generation techniques based on t-way factorial designs are at least as effective as handgenerated tests and superior to random testing and the factor experimental technique.
The oracle problem associated with the factorial design techniques was addressed using a test set minimisation approach. The mutation tool monitored which vectors could “kill” which code mutants. After a subset of the test vectors had been run, the most effective vectors were retained and the rest discarded. Likewise, mutants that were killed were removed from further consideration and the process repeated. Experimental results show that this minimisation procedure is effective at reducing computational overhead and is capable of producing final sets of test vectors that are comparable in size with the sets of hand-generated tests and so amenable to final hand checking
- …