65 research outputs found

Contextual Predictive Mutation Testing

Authors: Alon Uri; Goues Claire Le; Groce Alex; Jain Kush
Publication date: 05/09/2023

Mutation testing is a powerful technique for assessing and improving test suite quality that artificially introduces bugs and checks whether the test suites catch them. However, it is also computationally expensive and thus does not scale to large systems and projects. One promising recent approach to tackling this scalability problem uses machine learning to predict whether the tests will detect the synthetic bugs, without actually running those tests. However, existing predictive mutation testing approaches still misclassify 33% of detection outcomes on a randomly sampled set of mutant-test suite pairs. We introduce MutationBERT, an approach for predictive mutation testing that simultaneously encodes the source method mutation and test method, capturing key context in the input representation. Thanks to its higher precision, MutationBERT saves 33% of the time spent by a prior approach on checking/verifying live mutants. MutationBERT also outperforms the state-of-the-art in both same-project and cross-project settings, with meaningful improvements in precision, recall, and F1 score. We validate our input representation and our aggregation approaches for lifting predictions from the test matrix level to the test suite level, finding similar improvements in performance. MutationBERT not only enhances the state-of-the-art in predictive mutation testing, but also presents practical benefits for real-world applications, both in saving developer time and in finding hard-to-detect mutants.
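
The abstract mentions lifting predictions from the test matrix level (one prediction per mutant-test pair) to the test suite level (one prediction per mutant). The abstract does not say which aggregation rules were used, so the following is only a minimal sketch of one plausible rule: a mutant counts as detected by the suite if at least one test is predicted to detect it. The function names, the 0.5 threshold, and the data layout are assumptions made for illustration.

```python
from typing import Dict

# Hypothetical layout: matrix[mutant_id][test_id] = predicted probability
# that the given test detects (kills) the given mutant.
PredictionMatrix = Dict[str, Dict[str, float]]


def suite_level_predictions(matrix: PredictionMatrix,
                            threshold: float = 0.5) -> Dict[str, bool]:
    """Aggregate per-test predictions into one prediction per mutant.

    Assumed rule: the suite detects a mutant if any single test is
    predicted to detect it with probability >= threshold.
    """
    return {
        mutant: any(prob >= threshold for prob in per_test.values())
        for mutant, per_test in matrix.items()
    }


def predicted_mutation_score(matrix: PredictionMatrix,
                             threshold: float = 0.5) -> float:
    """Fraction of mutants predicted to be detected by the suite."""
    verdicts = suite_level_predictions(matrix, threshold)
    if not verdicts:
        return 0.0
    return sum(verdicts.values()) / len(verdicts)


EXAMPLE_MATRIX: PredictionMatrix = {
    "m1": {"t1": 0.9, "t2": 0.1},  # t1 is predicted to detect m1
    "m2": {"t1": 0.2, "t2": 0.3},  # no test is predicted to detect m2
}
```

Other aggregation rules (for example, averaging probabilities before thresholding) fit the same interface by changing only the body of `suite_level_predictions`.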

arXiv.org e-Print Archive

Predictive Mutation Testing

Authors: Hao D; Harman M; Jia Y; Zhang J; Zhang L; Zhang L
Publication date: 27/02/2018

Test suites play a key role in ensuring software quality. A good test suite may detect more faults than a poor-quality one. Mutation testing is a powerful methodology for evaluating the fault-detection ability of test suites. In mutation testing, a large number of mutants may be generated and need to be executed against the test suite under evaluation to check how many mutants the test suite is able to detect, as well as the kind of mutants that the current test suite fails to detect. Consequently, although highly effective, mutation testing is widely recognized to be computationally expensive, inhibiting wider uptake. To alleviate this efficiency concern, we propose Predictive Mutation Testing (PMT): the first approach to predicting mutation testing results without executing mutants. In particular, PMT constructs a classification model, based on a series of features related to mutants and tests, and uses the model to predict whether a mutant would be killed or remain alive without executing it. PMT has been evaluated on 163 real-world projects under two application scenarios (cross-version and cross-project). The experimental results demonstrate that PMT improves the efficiency of mutation testing by up to 151.4X while incurring only a small accuracy loss. It achieves above 0.80 AUC values for the majority of projects, indicating a good tradeoff between the efficiency and effectiveness of predictive mutation testing. Also, PMT is shown to perform well on different tools and tests, be robust in the presence of imbalanced data, and have high predictability (over 60% confidence) when predicting the execution results of the majority of mutants.

UCL Discovery

Predicting Survived and Killed Mutants

Author: Doliashvili Natia
Publication date: 01/01/2019

Mutation Testing is a powerful technique for evaluating the quality of a test suite. During evaluation, a large number of mutants are generated from the program's source code and executed against the test suite. The percentage of killed mutants indicates the strength of the test suite. The main idea is to see whether the test cases are robust enough to detect mutated code. Mutation Testing is an extremely costly and time-consuming technique, since each mutant needs to be executed against the test suite. For this reason, this thesis investigates the Predictive Mutation Testing (PMT) technique as a way to make Mutation Testing more efficient. PMT constructs a classification model based on features related to the mutated code and the test suite, and uses the model to predict whether a mutant will be killed or will survive, without actually executing it. This approach has been evaluated on several projects. Two Java projects were used to assess PMT under two application scenarios: cross-project and cross-version. A C project was also used to explore whether PMT can be applied to a different technology; for it, PMT was evaluated using only one version of the project. The experimental results demonstrate that PMT is able to predict the execution results of mutants with high accuracy. On the Java projects it achieves ROC-AUC values above 0.90 and Prediction Error values below 10%. On the C project it achieves a ROC-AUC value above 0.90 and a Prediction Error value below 1%. Overall, PMT is shown to perform well on different technologies and to be robust when dealing with imbalanced data.

DSpace at Tartu University Library

Is the Stack Distance Between Test Case and Method Correlated With Test Effectiveness?

Authors: Acree Allen Troy; Chawla Nitesh V; Jefferson Offutt A; Ji Changbin; Kohavi Ron; Marko Ivanković Goran Petrović; Niedermayr Rainer; Schuler David; Strug Joanna
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 13/03/2019

Mutation testing is a means to assess the effectiveness of a test suite, and its outcome is considered more meaningful than code coverage metrics. However, despite several optimizations, mutation testing requires a significant computational effort and has not been widely adopted in industry. Therefore, we study in this paper whether test effectiveness can be approximated using a more light-weight approach. We hypothesize that a test case is more likely to detect faults in methods that are close to the test case on the call stack than in methods that the test case accesses indirectly through many other methods. Based on this hypothesis, we propose the minimal stack distance between test case and method as a new test measure, which expresses how close any test case comes to a given method, and study its correlation with test effectiveness. We conducted an empirical study with 21 open-source projects, which comprise in total 1.8 million LOC, and show that a correlation exists between stack distance and test effectiveness. The correlation reaches a strength of up to 0.58. We further show that a classifier using the minimal stack distance along with additional easily computable measures can predict the mutation testing result of a method with 92.9% precision and 93.4% recall. Hence, such a classifier can be taken into consideration as a light-weight alternative to mutation testing or as a preceding, less costly step to that.
Comment: EASE 201
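
To make the measure concrete, here is a minimal sketch of a minimal stack distance computation. It assumes the calls are available as a simple caller-to-callees mapping and uses a breadth-first search to find the fewest call steps from a test case to a method. The abstract does not describe how the authors collect this information, so the graph representation and all names below are illustrative assumptions.

```python
from collections import deque
from typing import Dict, List, Optional

# Hypothetical call graph: caller name -> names of directly called methods.
CallGraph = Dict[str, List[str]]


def minimal_stack_distance(call_graph: CallGraph,
                           test_case: str,
                           method: str) -> Optional[int]:
    """Return the fewest call steps from test_case to method.

    A distance of 1 means the test calls the method directly. None means
    the method is not reachable from the test in this graph.
    """
    if test_case == method:
        return 0
    visited = {test_case}
    queue = deque([(test_case, 0)])
    while queue:
        current, distance = queue.popleft()
        for callee in call_graph.get(current, []):
            if callee == method:
                return distance + 1
            if callee not in visited:
                visited.add(callee)
                queue.append((callee, distance + 1))
    return None


EXAMPLE_GRAPH: CallGraph = {
    "testCheckout": ["checkout"],
    "checkout": ["computeTotal", "applyDiscount"],
    "computeTotal": ["roundPrice"],
}
```

Under the paper's hypothesis, `testCheckout` would be more likely to detect faults in `checkout` (distance 1) than in `roundPrice` (distance 3).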

arXiv.org e-Print Archive

Crossref

Faster Mutation Analysis via Equivalence Modulo States

Authors: Hao Dan; Shi Yangqingwei; Wang Bo; Xiong Yingfei; Zhang Lu
Publication date: 22/02/2017

Mutation analysis has many applications, such as assessing the quality of test suites and localizing faults. One important bottleneck of mutation analysis is scalability. The latest work explores the possibility of reducing redundant execution via split-stream execution. However, split-stream execution is only able to remove redundant execution before the first mutated statement. In this paper we also try to reduce some of the redundant execution after the first mutated statement has executed. We observe that, although many mutated statements are not equivalent, the execution result of those mutated statements may still be equivalent to the result of the original statement. In other words, the statements are equivalent modulo the current state. In this paper we propose a fast mutation analysis approach, AccMut. AccMut automatically detects the equivalence modulo states among a statement and its mutations, then groups the statements into equivalence classes modulo states, and uses only one process to represent each class. In this way, we can significantly reduce the number of split processes. Our experiments show that our approach can further accelerate mutation analysis on top of split-stream execution with a speedup of 2.56x on average.
Comment: Submitted to conference
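
The idea of equivalence modulo the current state can be illustrated with a small sketch. It is not AccMut's implementation; it only shows the grouping step under the assumption that a statement and its mutants can each be modelled as a function from a program state to a result. Variants that produce the same result in the current state fall into one class, so a single process could represent them. All names are hypothetical.

```python
from collections import defaultdict
from typing import Callable, Dict, Hashable, List

State = Dict[str, int]
Variant = Callable[[State], Hashable]


def group_modulo_state(variants: Dict[str, Variant],
                       state: State) -> List[List[str]]:
    """Group statement variants whose results coincide in the given state."""
    classes: Dict[Hashable, List[str]] = defaultdict(list)
    for name, evaluate in variants.items():
        classes[evaluate(state)].append(name)
    return list(classes.values())


# The original statement `a + b` and two arithmetic-operator mutants.
VARIANTS: Dict[str, Variant] = {
    "original: a + b": lambda s: s["a"] + s["b"],
    "mutant: a - b": lambda s: s["a"] - s["b"],
    "mutant: a * b": lambda s: s["a"] * s["b"],
}
```

With `a = 2, b = 2` the original and the `a * b` mutant both yield 4 and share a class, while with `a = 3, b = 0` the original and the `a - b` mutant coincide instead. The grouping therefore depends on the state, which is why it has to be recomputed when a mutated statement is reached.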

arXiv.org e-Print Archive

Crossref

Nationwide evaluation of mutation-tailored treatment of gastrointestinal stromal tumors in daily clinical practice

Authors: Gelderblom Hans; Grunberg Katrien; Ho Vincent K. Y.; Ligtenberg Marjolijn J. L.; Steeghs Elisabeth M. P.; Voorham Quirinus J. M.; Willems Stefan M.
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2021

Background: Molecular analysis of KIT and PDGFRA is critical for tyrosine kinase inhibitor treatment selection of gastrointestinal stromal tumors (GISTs) and hence recommended by international guidelines. We performed a nationwide study into the application of predictive mutation testing in GIST patients and its impact on targeted treatment decisions in clinical practice. Methods: Real-world clinical and pathology information was obtained from GIST patients with initial diagnosis in 2017-2018 through database linkage between the Netherlands Cancer Registry and the nationwide Dutch Pathology Registry. Results: Predictive mutation analysis was performed in 89% of the patients with high risk or metastatic disease. Molecular testing rates were higher for patients treated in expertise centers (96%) compared to non-expertise centers (75%, P < 0.01). Imatinib therapy was applied in 81% of the patients with high risk or metastatic disease without patient's refusal or adverse characteristics, e.g., comorbidities or resistance mutations. Mutation analysis, which was performed in 97% of these imatinib-treated cases, did not guarantee mutation-tailored treatment: 2% of these patients had the PDGFRA p.D842V resistance mutation and 7% initiated imatinib therapy at the normal instead of the high dose despite having a KIT exon 9 mutation. Conclusion: Nationwide real-world data show that over 81% of the eligible high risk or metastatic disease patients receive targeted therapy, which was tailored to the mutation status as recommended in guidelines in 88% of cases. Therefore, 27% of these GIST patients still miss out on mutation-tailored treatment. The reasons for suboptimal uptake of testing and treatment require further study.

Proceedings - University of Groningen

University of Groningen

ARTS repository - University of Groningen

Leiden University Scholarly Publications

Radboud Repository

Dissertations of the University of Groningen

Selecting fault revealing mutants

Authors: Bissyande Tegawendé François D Assise; Le Traon Yves; Papadakis Mike; Sen Koushik; Titcheu Chekam Thierry
Publication date: 18/12/2019

Mutant selection refers to the problem of choosing, among a large number of mutants, the (few) ones that should be used by the testers. In view of this, we investigate the problem of selecting the fault revealing mutants, i.e., the mutants that are killable and lead to test cases that uncover unknown program faults. We formulate two variants of this problem: fault revealing mutant selection and fault revealing mutant prioritization. We argue and show that these problems can be tackled through a set of ‘static’ program features and propose a machine learning approach, named FaRM, that learns to select and rank killable and fault revealing mutants. Experimental results involving 1,692 real faults show the practical benefits of our approach in both examined problems. Our results show that FaRM achieves a good trade-off between application cost and effectiveness (measured in terms of faults revealed). We also show that FaRM outperforms all the existing mutant selection methods, i.e., random mutant sampling, selective mutation, and defect prediction (mutating the code areas pointed to by defect prediction). In particular, our results show that, with respect to mutant selection, our approach reveals 23% to 34% more faults than any of the baseline methods, while, with respect to mutant prioritization, it achieves a higher average percentage of revealed faults, with a median difference between 4% and 9% (from the random mutant orderings).

Open Repository and Bibliography - Luxembourg