12 research outputs found

    Exploring a Better Search–based Implementation on Second–Order Mutant Generation

    Software testing is part of the software development process; its main aim is to reduce or eliminate faults in the software, and it is mainly done by executing test cases. One technique for measuring and improving test case quality is mutation testing. Despite its proven effectiveness, the technique has a major drawback: it is impractical, because it involves generating and executing a huge number of mutants. Recently, search-based optimization has become popular for software testing problems, with the main focus on optimizing test case generation. In this research, we explore the use of search-based optimization for mutant (program variant) generation, with the goal of generating mutants that escape test case detection, because such mutants can expose test case deficiencies. The proposed method is compared with a commonly used second-order mutant generation algorithm and with another search-based mutant generation approach. The results show that the proposed method generates more undetected mutants than the common second-order mutant generation algorithm. The proposed method performs worse than the benchmark search-based mutant generation method, but its performance can be improved by altering its solution representation and by adopting the optimization parameters used by the benchmark method.

    Assessing the influence of multiple test case selection on mutation experiments

    Mutation testing is widely used in experiments. Some papers experiment with mutation directly, while others use it to introduce faults to measure the effectiveness of tests created by other methods. There is some random variation in the mutation score depending on the specific test values used. When generating tests to use in experiments, a common, although not universal, practice is to generate multiple sets of tests to satisfy the same criterion or according to the same procedure, and then to compute their average performance. Averaging over multiple test sets is thought to reduce the variation in the mutation score. This practice is extremely expensive when tests are generated by hand (as is common) and as the number of programs increases (a current positive trend in software engineering experimentation). The research reported in this short paper asks a simple and direct question: do we need to generate multiple sets of test cases? That is, how do different test sets influence the cost and effectiveness results? In a controlled experiment, we generated 10 different test cases to be adequate for the Statement Deletion (SSDL) mutation operator for 39 small programs and functions, and then evaluated how they differ in terms of cost and effectiveness. We found that averaging over multiple programs was effective in reducing the variance in the mutation scores introduced by specific tests. FAPESP (process number 2012/16950-5).

    Evaluation of Mutation Testing in a Nuclear Industry Case Study

    For software quality assurance, many safety-critical industries appeal to the use of dynamic testing and structural coverage criteria. However, there are reasons to doubt the adequacy of such practices. Mutation testing has been suggested as an alternative or complementary approach, but its cost has traditionally hindered its adoption by industry, and there are limited studies applying it to real safety-critical code. This paper evaluates the effectiveness of state-of-the-art mutation testing on safety-critical code from within the U.K. nuclear industry, in terms of revealing flaws in test suites that already meet the structural coverage criteria recommended by relevant safety standards. It also assesses the practical feasibility of implementing such mutation testing in a real setting. We applied a conventional selective mutation approach to a C codebase supplied by a nuclear industry partner and measured the mutation score achieved by the existing test suite. We repeated the experiment using trivial compiler equivalence (TCE) to assess the benefit that it might provide. Using a conventional approach, it first appeared that the existing test suite only killed 82% of the mutants, but applying TCE revealed that it killed 92%. The difference was due to equivalent or duplicate mutants that TCE eliminated. We then added new tests to kill all the surviving mutants, increasing the test suite size by 18% in the process. In conclusion, mutation testing can potentially improve fault detection compared to structural-coverage-guided testing, and may be affordable in a nuclear industry context. The industry feedback on our results was positive, although further evidence is needed from application of mutation testing to software with known real faults.
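    The shift from 82% to 92% follows from how the mutation score is computed: removing equivalent or duplicate mutants shrinks the denominator (and, for killed duplicates, the numerator). The sketch below illustrates the arithmetic only. All counts are hypothetical and were picked to reproduce the two percentages; the abstract does not report the underlying mutant counts, and the function name is an assumption.

```python
def mutation_score(killed: int, total: int) -> float:
    """Fraction of the considered mutants that the test suite kills."""
    if total <= 0:
        raise ValueError("total must be positive")
    return killed / total


# Hypothetical counts (not from the paper), chosen only to mirror 82% -> 92%.
total_mutants = 1000
killed_mutants = 820
removed_surviving = 110  # removed by TCE and never killed (e.g. equivalent mutants)
removed_killed = 10      # removed by TCE as duplicates of another, killed mutant

score_before = mutation_score(killed_mutants, total_mutants)
score_after = mutation_score(
    killed_mutants - removed_killed,
    total_mutants - removed_surviving - removed_killed,
)
print(f"{score_before:.0%} -> {score_after:.0%}")
```

    Under these assumed counts, most of the removed mutants could never have been killed, so the raw score understated how thorough the test suite was.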

    A Fault-Based Model of Fault Localization Techniques

    Every day, ordinary people depend on software working properly. We take it for granted; from banking software, to railroad switching software, to flight control software, to software that controls medical devices such as pacemakers or even gas pumps, our lives are touched by software that we expect to work. It is well known that the main technique/activity used to ensure the quality of software is testing. Often it is the only quality assurance activity undertaken, making it that much more important. In a typical experiment studying these techniques, a researcher will intentionally seed a fault (intentionally breaking the functionality of some source code) with the hopes that the automated techniques under study will be able to identify the fault's location in the source code. These faults are picked arbitrarily; there is potential for bias in the selection of the faults. Previous researchers have established an ontology for understanding or expressing this bias called fault size. This research captures the fault size ontology in the form of a probabilistic model. The results of applying this model to measure fault size suggest that many faults generated through program mutation (the systematic replacement of source code operators to create faults) are very large and easily found. Secondary measures generated in the assessment of the model suggest a new static analysis method, called testability, for predicting the likelihood that code will contain a fault in the future. While software testing researchers are not statisticians, they nonetheless make extensive use of statistics in their experiments to assess fault localization techniques. Researchers often select their statistical techniques without justification. This is a very worrisome situation because it can lead to incorrect conclusions about the significance of research. This research introduces an algorithm, MeansTest, which helps automate some aspects of the selection of appropriate statistical techniques. The results of an evaluation of MeansTest suggest that MeansTest performs well relative to its peers. This research then surveys recent work in software testing using MeansTest to evaluate the significance of researchers' work. The results of the survey indicate that software testing researchers are underreporting the significance of their work.

    Establishing theoretical minimal sets of mutants

    Mutation analysis generates tests that distinguish variations, or mutants, of an artifact from the original. Mutation analysis is widely considered to be a powerful approach to testing, and hence is often used to evaluate other test criteria in terms of mutation score, which is the fraction of mutants that are killed by a test set. But mutation analysis is also known to provide large numbers of redundant mutants, and these mutants can inflate the mutation score. While mutation approaches broadly characterized as reduced mutation try to eliminate redundant mutants, the literature lacks a theoretical result that articulates just how many mutants are needed in any given situation. Hence, there is, at present, no way to characterize the contribution of, for example, a particular approach to reduced mutation with respect to any theoretical minimal set of mutants. This paper’s contribution is to provide such a theoretical foundation for mutant set minimization. The central theoretical result of the paper shows how to efficiently minimize mutant sets with respect to a set of test cases. We evaluate our method with a widely-used benchmark. FAPESP (process number 2012/16950-5).
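    One way to picture minimization with respect to a test set is through a kill matrix and the commonly used notion of dynamic subsumption: mutant A subsumes mutant B if some test kills A and every test that kills A also kills B. The sketch below is an illustration under that assumption, not necessarily the paper's own algorithm; the function name and the sample kill matrix are hypothetical.

```python
def minimal_mutants(kill_matrix):
    """Return a minimal mutant set for the given tests.

    kill_matrix maps each mutant name to the set of tests that kill it.
    """
    # Mutants that no test kills cannot be distinguished by this test set.
    killed = {m: frozenset(tests) for m, tests in kill_matrix.items() if tests}

    # Mutants with identical kill sets are indistinguishable: keep one each.
    representative = {}
    for mutant in sorted(killed):
        representative.setdefault(killed[mutant], mutant)

    # A mutant is redundant if another mutant's kill set is a proper subset
    # of its own: any test killing that other mutant also kills this one.
    return sorted(
        mutant
        for kill_set, mutant in representative.items()
        if not any(other < kill_set for other in representative)
    )


# Hypothetical kill matrix for five mutants and three tests.
example = {
    "m1": {"t1"},
    "m2": {"t1", "t2"},  # subsumed by m1
    "m3": {"t3"},
    "m4": {"t3"},        # indistinguishable from m3
    "m5": set(),         # not killed by any test
}
result = minimal_mutants(example)
print(result)
```

    In this example, any test set that kills the mutants in the returned list also kills `m2` and `m4`, so those two add nothing to the mutation score except inflation.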

    Experimental evaluation of SDL and One-Op mutation for C

    Mutation analysis modifies a program by applying syntactic rules, called mutation operators, systematically to create many versions of the program (mutants) that differ in small ways. Testers then design tests to cause the mutants to behave differently from the original program. Mutation testing is widely considered to result in very effective tests; however, it is also quite costly. Cost comes from the many mutants that are created, the number of tests that are needed to kill the mutants, and the difficulty of deciding whether mutants behave equivalently to the original program. One-op mutation theorizes that cost can be reduced by using a single, very powerful, mutation operator that leads to tests that are almost as effective as if all operators are used. Previous research proposed the statement deletion operator (SDL) and found promising results. This paper investigates the use of SDL-mutation in a new context, the language C, and poses additional empirical questions, including whether other operators can be used. We carried out a controlled experiment in which cost and effectiveness of each individual C mutation operator were collected for 39 different subject programs. Experimental data are used to define a cost-effectiveness metric to choose the best single operator for one-op mutation. FAPESP (process number 2012/16950-5).
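    To make the statement deletion idea concrete, the sketch below generates one mutant per statement by deleting that statement from a small function. It is an illustration only: the paper targets C, whereas this example mutates Python source with the standard `ast` module (Python 3.9+ for `ast.unparse`), and the helper names and the `clamp` subject function are hypothetical.

```python
import ast


def _statement_slots(tree):
    """Yield (statement_list, index) for each statement nested below module level."""
    for node in ast.walk(tree):
        if isinstance(node, ast.Module):
            continue  # do not delete whole top-level definitions
        for field in ("body", "orelse", "finalbody"):
            stmts = getattr(node, field, None)
            if isinstance(stmts, list):
                for index, stmt in enumerate(stmts):
                    if isinstance(stmt, ast.stmt):
                        yield stmts, index


def sdl_mutants(source):
    """Return the source of every mutant obtained by deleting one statement."""
    count = len(list(_statement_slots(ast.parse(source))))
    mutants = []
    for k in range(count):
        tree = ast.parse(source)  # fresh tree, same traversal order each time
        stmts, index = list(_statement_slots(tree))[k]
        del stmts[index]
        if not stmts:
            stmts.append(ast.Pass())  # a block may not be left empty
        mutants.append(ast.unparse(tree))
    return mutants


SUBJECT = """
def clamp(x, lo, hi):
    if x < lo:
        x = lo
    if x > hi:
        x = hi
    return x
"""

mutants = sdl_mutants(SUBJECT)
for number, text in enumerate(mutants, start=1):
    print(f"--- mutant {number} ---")
    print(text)
```

    The number of mutants equals the number of statements, which is why a single deletion operator tends to produce far fewer mutants than a full operator set.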

    Designing deletion mutation operators

    Mutation analysis modifies a program by applying syntactic rules, called mutation operators, systematically to create many versions of the program (mutants) that differ in small ways. Testers then design tests to cause the mutants to behave differently from the original program. Mutation testing is widely considered to result in very effective tests; however, it is also quite costly. Cost comes from the many mutants that are created, the number of tests that are needed to kill the mutants, and the difficulty of deciding whether mutants behave equivalently to the original program. One-op mutation theorizes that cost can be reduced by using a single, very powerful, mutation operator that leads to tests that are almost as effective as if all operators are used. Previous research proposed the statement deletion operator (SDL) and found promising results. This paper investigates the use of SDL-mutation in a new context, the language C, and poses additional empirical questions, including whether other operators can be used. We carried out a controlled experiment in which cost and effectiveness of each individual C mutation operator were collected for 39 different subject programs. Experimental data are used to define a cost-effectiveness metric to choose the best single operator for one-op mutation. FAPESP (process number 2012/16950-5).

    Selecting fault revealing mutants

    Mutant selection refers to the problem of choosing, among a large number of mutants, the (few) ones that should be used by the testers. In view of this, we investigate the problem of selecting the fault revealing mutants, i.e., the mutants that are killable and lead to test cases that uncover unknown program faults. We formulate two variants of this problem: fault revealing mutant selection and fault revealing mutant prioritization. We argue and show that these problems can be tackled through a set of ‘static’ program features and propose a machine learning approach, named FaRM, that learns to select and rank killable and fault revealing mutants. Experimental results involving 1,692 real faults show the practical benefits of our approach in both examined problems. Our results show that FaRM achieves a good trade-off between application cost and effectiveness (measured in terms of faults revealed). We also show that FaRM outperforms all the existing mutant selection methods, i.e., random mutant sampling, selective mutation, and defect prediction (mutating the code areas pointed to by defect prediction). In particular, our results show that, with respect to mutant selection, our approach reveals 23% to 34% more faults than any of the baseline methods, while, with respect to mutant prioritization, it achieves a higher average percentage of revealed faults, with a median difference between 4% and 9% (from the random mutant orderings).
