184 research outputs found

    Fault Detection Effectiveness of Metamorphic Relations Developed for Testing Supervised Classifiers

    Full text link
    In machine learning, supervised classifiers are used to obtain predictions for unlabeled data by inferring prediction functions using labeled data. Supervised classifiers are widely applied in domains such as computational biology, computational physics and healthcare to make critical decisions. However, it is often hard to test supervised classifiers since the expected answers are unknown. This is commonly known as the \emph{oracle problem} and metamorphic testing (MT) has been used to test such programs. In MT, metamorphic relations (MRs) are developed from intrinsic characteristics of the software under test (SUT). These MRs are used to generate test data and to verify the correctness of the test results without the presence of a test oracle. Effectiveness of MT heavily depends on the MRs used for testing. In this paper we have conducted an extensive empirical study to evaluate the fault detection effectiveness of MRs that have been used in multiple previous studies to test supervised classifiers. Our study uses a total of 709 reachable mutants generated by multiple mutation engines and uses data sets with varying characteristics to test the SUT. Our results reveal that only 14.8\% of these mutants are detected using the MRs and that the fault detection effectiveness of these MRs do not scale with the increased number of mutants when compared to what was reported in previous studies.Comment: 8 pages, AITesting 201

    Mutation testing on an object-oriented framework: An experience report

    Get PDF
    This is the preprint version of the article - Copyright @ 2011 ElsevierContext The increasing presence of Object-Oriented (OO) programs in industrial systems is progressively drawing the attention of mutation researchers toward this paradigm. However, while the number of research contributions in this topic is plentiful, the number of empirical results is still marginal and mostly provided by researchers rather than practitioners. Objective This article reports our experience using mutation testing to measure the effectiveness of an automated test data generator from a user perspective. Method In our study, we applied both traditional and class-level mutation operators to FaMa, an open source Java framework currently being used for research and commercial purposes. We also compared and contrasted our results with the data obtained from some motivating faults found in the literature and two real tools for the analysis of feature models, FaMa and SPLOT. Results Our results are summarized in a number of lessons learned supporting previous isolated results as well as new findings that hopefully will motivate further research in the field. Conclusion We conclude that mutation testing is an effective and affordable technique to measure the effectiveness of test mechanisms in OO systems. We found, however, several practical limitations in current tool support that should be addressed to facilitate the work of testers. We also missed specific techniques and tools to apply mutation testing at the system level.This work has been partially supported by the European Commission (FEDER) and Spanish Government under CICYT Project SETI (TIN2009-07366) and the Andalusian Government Projects ISABEL (TIC-2533) and THEOS (TIC-5906)

    LittleDarwin: a Feature-Rich and Extensible Mutation Testing Framework for Large and Complex Java Systems

    Full text link
    Mutation testing is a well-studied method for increasing the quality of a test suite. We designed LittleDarwin as a mutation testing framework able to cope with large and complex Java software systems, while still being easily extensible with new experimental components. LittleDarwin addresses two existing problems in the domain of mutation testing: having a tool able to work within an industrial setting, and yet, be open to extension for cutting edge techniques provided by academia. LittleDarwin already offers higher-order mutation, null type mutants, mutant sampling, manual mutation, and mutant subsumption analysis. There is no tool today available with all these features that is able to work with typical industrial software systems.Comment: Pre-proceedings of the 7th IPM International Conference on Fundamentals of Software Engineerin

    Jumble Java Byte Code to Measure the Effectiveness of Unit Tests

    Get PDF
    Jumble is a byte code level mutation testing tool for Java which inter-operates with JUnit. It has been designed to operate in an industrial setting with large projects. Heuristics have been included to speed the checking of mutations, for example, noting which test fails for each mutation and running this first in subsequent mutation checks. Significant effort has been put into ensuring that it can test code which uses custom class loading and reflection. This requires careful attention to class path handling and coexistence with foreign class-loaders. Jumble is currently used on a continuous basis within an agile programming environment with approximately 370,000 lines of Java code under source control. This checks out project code every fifteen minutes and runs an incremental set of unit tests and mutation tests for modified classes. Jumble is being made available as open source

    Program Repair by Stepwise Correctness Enhancement

    Full text link
    Relative correctness is the property of a program to be more-correct than another with respect to a given specification. Whereas the traditional definition of (absolute) correctness divides candidate program into two classes (correct, and incorrect), relative correctness arranges candidate programs on the richer structure of a partial ordering. In other venues we discuss the impact of relative correctness on program derivation, and on program verification. In this paper, we discuss the impact of relative correctness on program testing; specifically, we argue that when we remove a fault from a program, we ought to test the new program for relative correctness over the old program, rather than for absolute correctness. We present analytical arguments to support our position, as well as an empirical argument in the form of a small program whose faults are removed in a stepwise manner as its relative correctness rises with each fault removal until we obtain a correct program.Comment: In Proceedings PrePost 2016, arXiv:1605.0809

    Deviant: A Mutation Testing Tool for Solidity Smart Contracts

    Get PDF
    Blockchain in recent years has exploded in popularity with Ethereum being one of the leading blockchain platforms. Solidity is a widely used scripting language for creating smart contracts in Ethereum applications. Quality assurance in Solidity contracts is of critical importance because bugs or vulnerabilities can lead to a considerable loss of financial assets. However, it is unclear what level of quality assurance is provided in many of these applications. Mutation testing is the process of intentionally injecting faults into a target program and then running the provided test suite against the various injected faults. Mutation testing is used to evaluate the effectiveness of a test suite, measuring the test suite’s capability of covering certain types of faults. This thesis presents Deviant, the first implementation of a mutation testing tool for Solidity smart contracts. Deviant implements mutation operators that cover the unique features of Solidity according to our constructed fault model, in addition to traditional mutation operators that exist for other programming languages. Deviant has been applied to five open-source Solidity projects: MetaCoin [30], MultiSigWallet [31], Alice [29], aragonOS [32], and OpenZeppelin [33]. Experimental results show that the provided test suites result in low mutation scores. These results indicate that the provided tests cannot ensure high-level assurance of code quality. Such evaluation results offer important guidelines for Solidity developers to implement more effective tests in order to deliver trustworthy code and reduce the risk of financial loss

    Testing-based process for component substitutability

    Get PDF
    Software components have emerged to ease the assembly of software systems. However, updates of systems by substitution or upgrades of components demand careful management due to stability risks of deployed systems. Replacement components must be properly evaluated to identify if they provide the expected behaviour affected by substitution. To address this problem, this paper proposes a substitutability assessment process in which the regular compatibility analysis is complemented with the use of black-box testing criteria. The purpose is to observe the components' behaviour by analysing their internal functions of data transformation, which fulfils the observability testing metric. The approach is conceptually based on the technique Back-to-Back testing. When a component should be replaced, a specific Test Suite TS is built in order to represent its behavioural facets, viz. a Component Behaviour TS. This TS is later exercised on candidate upgrades or replacement components with the purpose of identifying the required compatibility. Automation of the process is supported through the testooj tool, which constrains the conditions and steps of the whole process in order to provide a rigorous and reliable approach.Fil: Flores, Andrés Pablo. Universidad Nacional del Comahue. Facultad de Informática. Departamento Ingeniería de Sistemas; Argentina. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Patagonia Confluencia; ArgentinaFil: Polo, Macario. Universidad de Castilla-La Mancha; Españ

    A transformation-based approach to testing concurrent programs using UML activity diagrams

    Get PDF
    UML activity diagrams are widely used to model concurrent interaction among multiple objects. In this paper, we propose a transformation-based approach to generating scenario-oriented test cases for applications modeled by UML activity diagrams. Using a set of transformation rules, the proposed approach first transforms a UML activity diagram specification into an intermediate representation, from which it then constructs test scenarios with respect to the given concurrency coverage criteria. The approach then finally derives a set of test cases for the constructed test scenarios. The approach resolves the difficulties associated with fork and join concurrency in the UML activity diagram, and enables control over the number of the resulting test cases. We further implemented a tool to automate the proposed approach, and studied its feasibility and effectiveness using a case study. Experimental results show that the approach can generate test cases on demand to satisfy a given concurrency coverage criterion, and can detect up to 76.5% of seeded faults when a weak coverage criterion is used. With the approach, testers can not only schedule the software test process earlier, but can also better allocate the testing resources for testing concurrent applications

    Generating Test Data Based on Improved Uniform Design Strategy

    Get PDF
    AbstractThe difficulty of k-way combination strategy for designing test data for obtaining high coverage rate is the number of test data is too large to process it. To overcome it, the improved uniform design method is presented in this paper. The method contains two steps. The first is to generate test set based on the selected representative values for each input parameter and uniform design method. The second is to add some special test data into the test set based on the improved uniform design strategy. Experiments show that the improved uniform design strategy with all valid and invalid levels could be used to generate fewer test data than that of other selected strategy, and the test effects are appropriate to that of other strategy

    Using Constraints for Equivalent Mutant Detection

    Full text link
    In mutation testing the question whether a mutant is equivalent to its program is important in order to compute the correct mutation score. Unfortunately, answering this question is not always possible and can hardly be obtained just by having a look at the program's structure. In this paper we introduce a method for solving the equivalent mutant problem using a constraint representation of the program and its mutant. In particularly the approach is based on distinguishing test cases, i.e., test inputs that force the program and its mutant to behave in a different way. Beside the foundations of the approach, in this paper we also present the algorithms and first empirical results.Comment: In Proceedings WS-FMDS 2012, arXiv:1207.184
    corecore