49,018 research outputs found

    Fault Detection Effectiveness of Metamorphic Relations Developed for Testing Supervised Classifiers

    Full text link
    In machine learning, supervised classifiers are used to obtain predictions for unlabeled data by inferring prediction functions using labeled data. Supervised classifiers are widely applied in domains such as computational biology, computational physics and healthcare to make critical decisions. However, it is often hard to test supervised classifiers since the expected answers are unknown. This is commonly known as the \emph{oracle problem} and metamorphic testing (MT) has been used to test such programs. In MT, metamorphic relations (MRs) are developed from intrinsic characteristics of the software under test (SUT). These MRs are used to generate test data and to verify the correctness of the test results without the presence of a test oracle. Effectiveness of MT heavily depends on the MRs used for testing. In this paper we have conducted an extensive empirical study to evaluate the fault detection effectiveness of MRs that have been used in multiple previous studies to test supervised classifiers. Our study uses a total of 709 reachable mutants generated by multiple mutation engines and uses data sets with varying characteristics to test the SUT. Our results reveal that only 14.8\% of these mutants are detected using the MRs and that the fault detection effectiveness of these MRs do not scale with the increased number of mutants when compared to what was reported in previous studies.Comment: 8 pages, AITesting 201

    Mutation testing on an object-oriented framework: An experience report

    Get PDF
    This is the preprint version of the article - Copyright @ 2011 ElsevierContext The increasing presence of Object-Oriented (OO) programs in industrial systems is progressively drawing the attention of mutation researchers toward this paradigm. However, while the number of research contributions in this topic is plentiful, the number of empirical results is still marginal and mostly provided by researchers rather than practitioners. Objective This article reports our experience using mutation testing to measure the effectiveness of an automated test data generator from a user perspective. Method In our study, we applied both traditional and class-level mutation operators to FaMa, an open source Java framework currently being used for research and commercial purposes. We also compared and contrasted our results with the data obtained from some motivating faults found in the literature and two real tools for the analysis of feature models, FaMa and SPLOT. Results Our results are summarized in a number of lessons learned supporting previous isolated results as well as new findings that hopefully will motivate further research in the field. Conclusion We conclude that mutation testing is an effective and affordable technique to measure the effectiveness of test mechanisms in OO systems. We found, however, several practical limitations in current tool support that should be addressed to facilitate the work of testers. We also missed specific techniques and tools to apply mutation testing at the system level.This work has been partially supported by the European Commission (FEDER) and Spanish Government under CICYT Project SETI (TIN2009-07366) and the Andalusian Government Projects ISABEL (TIC-2533) and THEOS (TIC-5906)

    Automated metamorphic testing on the analyses of feature models

    Get PDF
    Copyright Ā© 2010 Elsevier B.V. All rights reserved.Context: A feature model (FM) represents the valid combinations of features in a domain. The automated extraction of information from FMs is a complex task that involves numerous analysis operations, techniques and tools. Current testing methods in this context are manual and rely on the ability of the tester to decide whether the output of an analysis is correct. However, this is acknowledged to be time-consuming, error-prone and in most cases infeasible due to the combinatorial complexity of the analyses, this is known as the oracle problem.Objective: In this paper, we propose using metamorphic testing to automate the generation of test data for feature model analysis tools overcoming the oracle problem. An automated test data generator is presented and evaluated to show the feasibility of our approach.Method: We present a set of relations (so-called metamorphic relations) between input FMs and the set of products they represent. Based on these relations and given a FM and its known set of products, a set of neighbouring FMs together with their corresponding set of products are automatically generated and used for testing multiple analyses. Complex FMs representing millions of products can be efficiently created by applying this process iteratively.Results: Our evaluation results using mutation testing and real faults reveal that most faults can be automatically detected within a few seconds. Two defects were found in FaMa and another two in SPLOT, two real tools for the automated analysis of feature models. Also, we show how our generator outperforms a related manual suite for the automated analysis of feature models and how this suite can be used to guide the automated generation of test cases obtaining important gains in efficiency.Conclusion: Our results show that the application of metamorphic testing in the domain of automated analysis of feature models is efficient and effective in detecting most faults in a few seconds without the need for a human oracle.This work has been partially supported by the European Commission(FEDER)and Spanish Government under CICYT project SETI(TIN2009-07366)and the Andalusian Government project ISABEL(TIC-2533)

    Automatically Discovering, Reporting and Reproducing Android Application Crashes

    Full text link
    Mobile developers face unique challenges when detecting and reporting crashes in apps due to their prevailing GUI event-driven nature and additional sources of inputs (e.g., sensor readings). To support developers in these tasks, we introduce a novel, automated approach called CRASHSCOPE. This tool explores a given Android app using systematic input generation, according to several strategies informed by static and dynamic analyses, with the intrinsic goal of triggering crashes. When a crash is detected, CRASHSCOPE generates an augmented crash report containing screenshots, detailed crash reproduction steps, the captured exception stack trace, and a fully replayable script that automatically reproduces the crash on a target device(s). We evaluated CRASHSCOPE's effectiveness in discovering crashes as compared to five state-of-the-art Android input generation tools on 61 applications. The results demonstrate that CRASHSCOPE performs about as well as current tools for detecting crashes and provides more detailed fault information. Additionally, in a study analyzing eight real-world Android app crashes, we found that CRASHSCOPE's reports are easily readable and allow for reliable reproduction of crashes by presenting more explicit information than human written reports.Comment: 12 pages, in Proceedings of 9th IEEE International Conference on Software Testing, Verification and Validation (ICST'16), Chicago, IL, April 10-15, 2016, pp. 33-4

    Software component testing : a standard and the effectiveness of techniques

    Get PDF
    This portfolio comprises two projects linked by the theme of software component testing, which is also often referred to as module or unit testing. One project covers its standardisation, while the other considers the analysis and evaluation of the application of selected testing techniques to an existing avionics system. The evaluation is based on empirical data obtained from fault reports relating to the avionics system. The standardisation project is based on the development of the BC BSI Software Component Testing Standard and the BCS/BSI Glossary of terms used in software testing, which are both included in the portfolio. The papers included for this project consider both those issues concerned with the adopted development process and the resolution of technical matters concerning the definition of the testing techniques and their associated measures. The test effectiveness project documents a retrospective analysis of an operational avionics system to determine the relative effectiveness of several software component testing techniques. The methodology differs from that used in other test effectiveness experiments in that it considers every possible set of inputs that are required to satisfy a testing technique rather than arbitrarily chosen values from within this set. The three papers present the experimental methodology used, intermediate results from a failure analysis of the studied system, and the test effectiveness results for ten testing techniques, definitions for which were taken from the BCS BSI Software Component Testing Standard. The creation of the two standards has filled a gap in both the national and international software testing standards arenas. Their production required an in-depth knowledge of software component testing techniques, the identification and use of a development process, and the negotiation of the standardisation process at a national level. The knowledge gained during this process has been disseminated by the author in the papers included as part of this portfolio. The investigation of test effectiveness has introduced a new methodology for determining the test effectiveness of software component testing techniques by means of a retrospective analysis and so provided a new set of data that can be added to the body of empirical data on software component testing effectiveness
    • ā€¦
    corecore