10 research outputs found

    Test oracle assessment and improvement

    Get PDF
    We introduce a technique for assessing and improving test oracles by reducing the incidence of both false positives and false negatives. We prove that our approach can always result in an increase in the mutual information between the actual and perfect oracles. Our technique combines test case generation to reveal false positives and mutation testing to reveal false negatives. We applied the decision support tool that implements our oracle improvement technique to five real-world subjects. The experimental results show that the fault detection rate of the oracles after improvement increases, on average, by 48.6% (86% over the implicit oracle). Three actual, exposed faults in the studied systems were subsequently confirmed and fixed by the developers

    Automated blackbox GUI specifications enhancement and test data generation

    Get PDF
    Applications with a Graphical User Interface (GUI) front-end are ubiquitous nowadays. While automated model-based approaches have been shown to be effective in testing of such applications, most existing techniques produce many infeasible event sequences used as GUI test cases. This happens primarily because the behavioral specifications of the GUI under test are ignored. In this dissertation we present an automated framework that reveals an important set of state-based constraints among GUI events based on infeasible (i.e., unexecutable or partially executable) test cases of a GUI test suite. GUIDiVa, an iterative algorithm at the core of our framework, enumerates all possible constraint violations as potential reasons for test case failure, on the failed event of an infeasible test case. It then selects and adds the most promising constraints of each iteration to a final set based on the Validity Weight of constraints. The results of empirical studies on both seeded and nine non-trivial open-source study subjects show that our framework is capable of capturing important aspects of GUI behavior in the form of state-based event constraints, while considerably reducing the number of insfeasible test cases. The second part of this dissertation deals with the problem of automatic generation of relevant test data for parameterized GUI events (i.e., events associated with widgets that accept user inputs such as textboxes and textareas). Current techniques either manipulate the source code of the application under test (AUT) to generate the test data, or blindly use a set of random string values. We propose a novel way to generate the test data by exploiting the information provided in the GUI structure to extract a set of key identifiers for each parameterized GUI widget. These identifiers are used to compose appropriate online search phrases and collect relevant test data from the Internet. The results of an empirical study on five GUI-based applications show that the proposed approach is applicable and results in execution of some hard-to-cover branches in the subject programs. The proposed technique works from a black-box perspective and is entirely independent from GUI modeling and event sequence generation, thus it does not require source code access and offers the possibility of being integrated with existing GUI testing frameworks

    Automated Test Case Generation as a Many-Objective Optimisation Problem with Dynamic Selection of the Targets

    Get PDF
    The test case generation is intrinsically a multi-objective problem, since the goal is covering multiple test targets (e.g., branches). Existing search-based approaches either consider one target at a time or aggregate all targets into a single fitness function (whole-suite approach). Multi and many-objective optimisation algorithms (MOAs) have never been applied to this problem, because existing algorithms do not scale to the number of coverage objectives that are typically found in real-world software. In addition, the final goal for MOAs is to find alternative trade-off solutions in the objective space, while in test generation the interesting solutions are only those test cases covering one or more uncovered targets. In this paper, we present DynaMOSA (Dynamic Many-Objective Sorting Algorithm), a novel many-objective solver specifically designed to address the test case generation problem in the context of coverage testing. DynaMOSA extends our previous many-objective technique MOSA (Many-Objective Sorting Algorithm) with dynamic selection of the coverage targets based on the control dependency hierarchy. Such extension makes the approach more effective and efficient in case of limited search budget. We carried out an empirical study on 346 Java classes using three coverage criteria (i.e., statement, branch, and strong mutation coverage) to assess the performance of DynaMOSA with respect to the whole-suite approach (WS), its archive-based variant (WSA) and MOSA. The results show that DynaMOSA outperforms WSA in 28% of the classes for branch coverage (+8% more coverage on average) and in 27% of the classes for mutation coverage (+11% more killed mutants on average). It outperforms WS in 51% of the classes for statement coverage, leading to +11% more coverage on average. Moreover, DynaMOSA outperforms its predecessor MOSA for all the three coverage criteria in 19% of the classes with +8% more code coverage on average

    Oracle Assessment, Improvement and Placement

    Get PDF
    The oracle problem remains one of the key challenges in software testing, for which little automated support has been developed so far. This thesis analyses the prevalence of failed error propagation in programs with real faults to address the oracle placement problem and introduces an approach for iterative assessment and improvement of the oracles. To analyse failed error propagation in programs with real faults, we have conducted an empirical study, considering Defects4J, a benchmark of Java programs, of which we used all 6 projects available, 384 real bugs and 528 methods fixed to correct such bugs. The results indicate that the prevalence of failed error propagation is negligible. Moreover, the results on real faults differ from the results on mutants, indicating that if failed error propagation is taken into account, mutants are not a good surrogate of real faults. When measuring failed error propagation, for each method we use the strongest possible oracle as postcondition, which checks all externally observable program variables. The low prevalence of failed error propagation is caused by the presence of such a strong oracle, which usually is not available in practice. Therefore, there is a need for a technique to assess and improve existing weaker oracles. We propose a technique for assessing and improving test oracles, which necessarily places the human tester in the loop and is based on reducing the incidence of both false positives and false negatives. A proof showing that this approach results in an increase in the mutual information between the actual and perfect oracles is provided. The application of the approach to five real-world subjects shows that the fault detection rate of the oracles after improvement increases, on average, by 48.6%. The further evaluation with 39 participants assessed the ability of humans to detect false positives and false negatives manually, without any tool support. The correct classification rate achieved by humans in this case is poor (29%) indicating how helpful our automated approach can be for developers. The comparison of humans’ ability to improve oracles with and without the tool in a study with 29 other participants also empirically validates the effectiveness of the approach

    Multi-objective Search-based Mobile Testing

    Get PDF
    Despite the tremendous popularity of mobile applications, mobile testing still relies heavily on manual testing. This thesis presents mobile test automation approaches based on multi-objective search. We introduce three approaches: Sapienz (for native Android app testing), Octopuz (for hybrid/web JavaScript app testing) and Polariz (for using crowdsourcing to support search-based mobile testing). These three approaches represent the primary scientific and technical contributions of the thesis. Since crowdsourcing is, itself, an emerging research area, and less well understood than search-based software engineering, the thesis also provides the first comprehensive survey on the use of crowdsourcing in software testing (in particular) and in software engineering (more generally). This survey represents a secondary contribution. Sapienz is an approach to Android testing that uses multi-objective search-based testing to automatically explore and optimise test sequences, minimising their length, while simultaneously maximising their coverage and fault revelation. The results of empirical studies demonstrate that Sapienz significantly outperforms both the state-of-the-art technique Dynodroid and the widely-used tool, Android Monkey, on all three objectives. When applied to the top 1,000 Google Play apps, Sapienz found 558 unique, previously unknown crashes. Octopuz reuses the Sapienz multi-objective search approach for automated JavaScript testing, aiming to investigate whether it replicates the Sapienz’ success on JavaScript testing. Experimental results on 10 real-world JavaScript apps provide evidence that Octopuz significantly outperforms the state of the art (and current state of practice) in automated JavaScript testing. Polariz is an approach that combines human (crowd) intelligence with machine (computational search) intelligence for mobile testing. It uses a platform that enables crowdsourced mobile testing from any source of app, via any terminal client, and by any crowd of workers. It generates replicable test scripts based on manual test traces produced by the crowd workforce, and automatically extracts from these test traces, motif events that can be used to improve search-based mobile testing approaches such as Sapienz

    Exploring means to facilitate software debugging

    Get PDF
    In this thesis, several aspects of software debugging from automated crash reproduction to bug report analysis and use of contracts have been studied.Algorithms and the Foundations of Software technolog
    corecore