77,203 research outputs found

    Test Set Diameter: Quantifying the Diversity of Sets of Test Cases

    A common and natural intuition among software testers is that test cases need to differ if a software system is to be tested properly and its quality ensured. Consequently, much research has gone into formulating distance measures for how test cases, their inputs and/or their outputs differ. However, common to these proposals is that they are data type specific and/or calculate the diversity only between pairs of test inputs, traces or outputs. We propose a new metric to measure the diversity of sets of tests: the test set diameter (TSDm). It extends our earlier, pairwise test diversity metrics based on recent advances in information theory regarding the calculation of the normalized compression distance (NCD) for multisets. An advantage is that TSDm can be applied regardless of data type and on any test-related information, not only the test inputs. A downside is the increased computational time compared to competing approaches. Our experiments on four different systems show that the test set diameter can help select test sets with higher structural and fault coverage than random selection even when only applied to test inputs. This can enable early test design and selection, prior to even having a software system to test, and complement other types of test automation and analysis. We argue that this quantification of test set diversity creates a number of opportunities to better understand software quality and provides practical ways to increase it. Comment: In submission
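
    The abstract above builds on the normalized compression distance. As a rough illustration only, and not the authors' implementation, the sketch below computes the standard pairwise NCD and a multiset variant in the style of the NCD1 formula from the multiset NCD literature. The choice of zlib as compressor, plain concatenation of test inputs, and the function names `ncd` and `ncd_multiset` are all assumptions made for this example.

```python
import zlib


def csize(data: bytes) -> int:
    """Compressed size in bytes; zlib is an assumed stand-in compressor."""
    return len(zlib.compress(data, 9))


def ncd(x: bytes, y: bytes) -> float:
    """Pairwise NCD: (C(xy) - min(C(x), C(y))) / max(C(x), C(y))."""
    cx, cy = csize(x), csize(y)
    return (csize(x + y) - min(cx, cy)) / max(cx, cy)


def ncd_multiset(items: list[bytes]) -> float:
    """Multiset NCD sketch: (C(X) - min_x C(x)) / max_x C(X without x).

    Assumes at least two items and models C(X) as the compressed size
    of the items concatenated in list order.
    """
    if len(items) < 2:
        raise ValueError("need at least two items")
    c_all = csize(b"".join(items))
    c_min = min(csize(item) for item in items)
    # Compressed size of the set with each single item left out in turn.
    c_rest = max(
        csize(b"".join(items[:i] + items[i + 1:]))
        for i in range(len(items))
    )
    return (c_all - c_min) / c_rest


if __name__ == "__main__":
    # Hypothetical test inputs: one repetitive, two with little shared structure.
    repetitive = b"abc" * 100
    ascending = bytes(range(256))
    descending = bytes(reversed(range(256)))
    print("redundant set:", ncd_multiset([repetitive] * 3))
    print("diverse set:  ", ncd_multiset([repetitive, ascending, descending]))
```

    Under these assumptions a set of near-identical inputs scores lower than a set of dissimilar ones, which is the property a diversity-driven test selection would exploit. For two items the multiset form reduces to the pairwise NCD.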

    Optimizing compilation with preservation of structural code coverage metrics to support software testing

    Code-coverage-based testing is a widely used testing strategy that aims to provide a meaningful decision criterion for the adequacy of a test suite. Code-coverage-based testing is also mandated for the development of safety-critical applications; for example, the DO-178B document requires the application of the modified condition/decision coverage. One critical issue of code-coverage testing is that structural code coverage criteria are typically applied to source code, whereas the generated machine code may have a different code structure because of code optimizations performed by a compiler. In this work, we present the automatic calculation of coverage profiles describing which structural code-coverage criteria are preserved by which code optimization, independently of the concrete test suite. These coverage profiles make it easy to extend compilers with the feature of preserving any given code-coverage criterion by enabling only those code optimizations that preserve it. Furthermore, we describe the integration of these coverage profiles into the compiler GCC. With these coverage profiles, we answer the question of how much code optimization is possible without compromising the error-detection likelihood of a given test suite. Experimental results show that the performance cost of preserving structural code coverage in GCC is rather low. Peer reviewed. Submitted version.

    Predicting regression test failures using genetic algorithm-selected dynamic performance analysis metrics

    A novel framework for predicting regression test failures is proposed. The basic principle embodied in the framework is to use performance analysis tools to capture the runtime behaviour of a program as it executes each test in a regression suite. The performance information is then used to build a dynamically predictive model of test outcomes. Our framework is evaluated using a genetic algorithm for dynamic metric selection in combination with state-of-the-art machine learning classifiers. We show that if a program is modified and some tests subsequently fail, then it is possible to predict with considerable accuracy which of the remaining tests will also fail, which can be used to help prioritise tests in time-constrained testing environments.

    Analyzing the test process using structural coverage

    A large, commercially developed FORTRAN program was modified to produce structural coverage metrics. The modified program was executed on a set of functionally generated acceptance tests and a large sample of operational usage cases. The resulting structural coverage metrics are combined with fault and error data to evaluate structural coverage. It was shown that in the software environment the functionally generated tests seem to be a good approximation of operational use. The relative proportions of the exercised statement subclasses change as the structural coverage of the program increases. A method was also proposed for evaluating whether two sets of input data exercise a program in a similar manner. Evidence was provided that implies that in this environment, faults revealed in a procedure are independent of the number of times the procedure is executed and that it may be reasonable to use procedure coverage in software models that use statement coverage. Finally, the evidence suggests that it may be possible to use structural coverage to aid in the management of the acceptance test process.

    Performance Evaluation and Optimization of Math-Similarity Search

    Similarity search in math aims to find mathematical expressions that are similar to a user's query. We conceptualized the similarity factors between mathematical expressions, and proposed an approach to math similarity search (MSS) by defining metrics based on those similarity factors [11]. Our preliminary implementation indicated the advantage of MSS compared to non-similarity-based search. To search for similar math expressions more effectively and efficiently, MSS is further optimized. This paper focuses on performance evaluation and optimization of MSS. Our results show that the proposed optimization process significantly improved the performance of MSS with respect to both relevance ranking and recall. Comment: 15 pages, 8 figures

    Software component testing : a standard and the effectiveness of techniques

    This portfolio comprises two projects linked by the theme of software component testing, which is also often referred to as module or unit testing. One project covers its standardisation, while the other considers the analysis and evaluation of the application of selected testing techniques to an existing avionics system. The evaluation is based on empirical data obtained from fault reports relating to the avionics system. The standardisation project is based on the development of the BCS/BSI Software Component Testing Standard and the BCS/BSI Glossary of terms used in software testing, which are both included in the portfolio. The papers included for this project consider both those issues concerned with the adopted development process and the resolution of technical matters concerning the definition of the testing techniques and their associated measures. The test effectiveness project documents a retrospective analysis of an operational avionics system to determine the relative effectiveness of several software component testing techniques. The methodology differs from that used in other test effectiveness experiments in that it considers every possible set of inputs that are required to satisfy a testing technique rather than arbitrarily chosen values from within this set. The three papers present the experimental methodology used, intermediate results from a failure analysis of the studied system, and the test effectiveness results for ten testing techniques, definitions for which were taken from the BCS/BSI Software Component Testing Standard. The creation of the two standards has filled a gap in both the national and international software testing standards arenas. Their production required an in-depth knowledge of software component testing techniques, the identification and use of a development process, and the negotiation of the standardisation process at a national level. The knowledge gained during this process has been disseminated by the author in the papers included as part of this portfolio. The investigation of test effectiveness has introduced a new methodology for determining the test effectiveness of software component testing techniques by means of a retrospective analysis and so provided a new set of data that can be added to the body of empirical data on software component testing effectiveness.

    Driver-pressure-impact and response-recovery chains in European rivers: observed and predicted effects on BQEs

    This report is part of the outcome of WISER’s river Workpackage WP5.1 and, as such, part of the module on aquatic ecosystem management and restoration. The ultimate goal of WP5.1 is to provide guidance on best-practice restoration and management to practitioners in River Basin Management. To that end, a series of analyses was undertaken, each of which used a part of the WP5.1 database to track two major pathways of biological response: 1) the response of riverine biota to environmental pressures (degradation) and 2) the response of biota to the reduction of these impacts (restoration). This report attempts to provide empirical evidence on the environment-biota relationships for both pathways.