Search CORE

20 research outputs found

Evaluating testing methods by delivered reliability

Author: Frankl P. G.
Hamlet R. G.
Littlewood B.
Strigini L.
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/1998
Field of study

There are two main goals in testing software: (1) to achieve adequate quality (debug testing), where the objective is to probe the software for defects so that these can be removed, and (2) to assess existing quality (operational testing), where the objective is to gain confidence that the software is reliable. Debug methods tend to ignore random selection of test data from an operational profile, while for operational methods this selection is all-important. Debug methods are thought to be good at uncovering defects so that these can be repaired, but having done so they do not provide a technically defensible assessment of the reliability that results. On the other hand, operational methods provide accurate assessment, but may not be as useful for achieving reliability. This paper examines the relationship between the two testing goals, using a probabilistic analysis. We define simple models of programs and their testing, and try to answer the question of how to attain program reliability: is it better to test by probing for defects as in debug testing, or to assess reliability directly as in operational testing? Testing methods are compared in a model where program failures are detected and the software changed to eliminate them. The “better” method delivers higher reliability after all test failures have been eliminated. Special cases are exhibited in which each kind of testing is superior. An analysis of the distribution of the delivered reliability indicates that even simple models have unusual statistical properties, suggesting caution in interpreting theoretical comparisons

CiteSeerX

City Research Online

Robust Dynamic Selection of Tested Modules in Software Testing for Maximizing Delivered Reliability

Author: Cai Kai-Yuan
Cao Ping
Dong Zhao
Liu Ke
Publication venue
Publication date: 16/01/2017
Field of study

Software testing is aimed to improve the delivered reliability of the users. Delivered reliability is the reliability of using the software after it is delivered to the users. Usually the software consists of many modules. Thus, the delivered reliability is dependent on the operational profile which specifies how the users will use these modules as well as the defect number remaining in each module. Therefore, a good testing policy should take the operational profile into account and dynamically select tested modules according to the current state of the software during the testing process. This paper discusses how to dynamically select tested modules in order to maximize delivered reliability by formulating the selection problem as a dynamic programming problem. As the testing process is performed only once, risk must be considered during the testing process, which is described by the tester's utility function in this paper. Besides, since usually the tester has no accurate estimate of the operational profile, by employing robust optimization technique, we analysis the selection problem in the worst case, given the uncertainty set of operational profile. By numerical examples, we show the necessity of maximizing delivered reliability directly and using robust optimization technique when the tester has no clear idea of the operational profile. Moreover, it is shown that the risk averse behavior of the tester has a major influence on the delivered reliability.Comment: 19 pages, 4 figure

arXiv.org e-Print Archive

CiteSeerX

Recommended from our members

Comparing the effectiveness of testing methods in improving programs: the effect of variations in program quality

Author: Pizza M.
Strigini L.
Publication venue
Publication date: 01/01/1998
Field of study

We compare the efficacy of different testing methods for improving the reliability of software. Specifically, we use modelling to compare “operational” testing, in which test cases are chosen according to their probability of occurring in actual use of the software, against “debug” testing methods, in which the testers look for test cases which they consider likely to cause failure, or that satisfy some coverage criterion. We base our comparisons on the reliability reached by the program at the end of testing. Differently from previous studies, we consider the probability distribution of the achieved reliability, and thus the probability of satisfying specific requirements, rather than just the average reliability achieved. We take account of two sources of variation. The variation between the actual test histories that are possible for a given program and a given test method: and the fact that different programs start testing with different faults and initial reliability levels. By necessity, we use very simplified models of reality. Yet, we can show some interesting conclusions with important practical consequences. In general, there are stronger arguments in favor of operational testing than previous studies have show

City Research Online

Temporal Modeling of Software Test Coverage

Author: Ghafoor Arif
Paul Raymond A.
Sedigh Sahra
Publication venue: Scholars\u27 Mine
Publication date: 01/08/2002
Field of study

This paper presents a temporal model for the coverage achieved by software testing. The proposed model, which is applicable at any level of the testing hierarchy, can determine the value of test coverage at any given time, as well as predicting future values. The model is comprised of two main components: coverage functions, and the coverage matrix. The coverage functions represent the coverage of a single entity as a function of time and reflect the test environment through their stochastic parameters. The coverage matrix utilizes the coverage functions to depict the coverage attained for each entity by each test within the test suite. A normalized sum of the elements of the coverage matrix is used to represent the overall coverage achieved by the test suite, as a function of time. The application of the model to multi-phase testing is illustrated In the application section, test coverage values from Y2K compliance testing are used to verify model predictions

Missouri University of Science and Technology (Missouri S&T): Scholars' Mine

Recommended from our members

Fault tolerance via diversity for off-the-shelf products: A study with SQL database servers

Author: Gashi I.
Popov P. T.
Strigini L.
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/10/2007
Field of study

If an off-the-shelf software product exhibits poor dependability due to design faults, then software fault tolerance is often the only way available to users and system integrators to alleviate the problem. Thanks to low acquisition costs, even using multiple versions of software in a parallel architecture, which is a scheme formerly reserved for few and highly critical applications, may become viable for many applications. We have studied the potential dependability gains from these solutions for off-the-shelf database servers. We based the study on the bug reports available for four off-the-shelf SQL servers plus later releases of two of them. We found that many of these faults cause systematic noncrash failures, which is a category ignored by most studies and standard implementations of fault tolerance for databases. Our observations suggest that diverse redundancy would be effective for tolerating design faults in this category of products. Only in very few cases would demands that triggered a bug in one server cause failures in another one, and there were no coincident failures in more than two of the servers. Use of different releases of the same product would also tolerate a significant fraction of the faults. We report our results and discuss their implications, the architectural options available for exploiting them, and the difficulties that they may present

City Research Online

Crossref

Test adequacy assessment using test-defect coverage analytic model

Author: Haron NH
McBride T
Syed-Mohamad SM
Publication venue
Publication date: 01/01/2017
Field of study

Software testing is an essential activity in software development process that has been widely used as a means of achieving software reliability and quality. The emergence of incremental development in its various forms required a different approach to determining the readiness of the software for release. This approach needs to determine how reliable the software is likely to be based on planned tests, not defect growth and decline as typically shown in reliability growth models. A combination of information from a number of sources into an easily understood dashboard is expected to provide both qualitative and quantitative analyses of test and defect coverage properties. Hence, Test-Defect Coverage Analytic Model (TDCAM) is proposed which combines test and defect coverage information presented in a dashboard to help deciding whether there are enough tests planned. A case study has been conducted to demonstrate the usage of the proposed model. The visual representations and results gained from the case study show the benefits of TDCAM in assisting practitioners making informed test adequacy-related decisions

OPUS - University of Technology Sydney

Application of a failure driven test profile in random testing

Author: Chen T
Kuo F
Liu H
Publication venue: IEEE (Piscataway, NJ, USA)
Publication date: 01/01/2009
Field of study

Random testing techniques have been extensively used in reliability assessment, as well as in debug testing. When used to assess software reliability, random testing selects test cases based on an operational profile; while in the context of debug testing, random testing often uses a uniform distribution. However, generally neither an operational profile nor a uniform distribution is chosen from the perspective of maximizing the effectiveness of failure detection. Adaptive random testing has been proposed to enhance the failure detection capability of random testing by evenly spreading test cases over the whole input domain. In this paper, we propose a new test profile, which is different from both the uniform distribution, and operational profiles. The aim of the new test profile is to maximize the effectiveness of failure detection. We integrate this new test profile with some existing adaptive random testing algorithms, and develop a family of new random testing algorithms. These new algorithms not only distribute test cases more evenly, but also have better failure detection capabilities than the corresponding original adaptive random testing algorithms. As a consequence, they perform better than the pure random testing

RMIT Research Repository

Victoria University Eprints Repository

Swinburne Research Bank

Boosting Operational DNN Testing Efficiency through Conditioning

Author: Cao Chun
Li Zenan
Lü Jian
Ma Xiaoxing
Xu Chang
Xu Jingwei
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 27/06/2019
Field of study

With the increasing adoption of Deep Neural Network (DNN) models as integral parts of software systems, efficient operational testing of DNNs is much in demand to ensure these models' actual performance in field conditions. A challenge is that the testing often needs to produce precise results with a very limited budget for labeling data collected in field. Viewing software testing as a practice of reliability estimation through statistical sampling, we re-interpret the idea behind conventional structural coverages as conditioning for variance reduction. With this insight we propose an efficient DNN testing method based on the conditioning on the representation learned by the DNN model under testing. The representation is defined by the probability distribution of the output of neurons in the last hidden layer of the model. To sample from this high dimensional distribution in which the operational data are sparsely distributed, we design an algorithm leveraging cross entropy minimization. Experiments with various DNN models and datasets were conducted to evaluate the general efficiency of the approach. The results show that, compared with simple random sampling, this approach requires only about a half of labeled inputs to achieve the same level of precision.Comment: Published in the Proceedings of the 27th ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering (ESEC/FSE 2019

arXiv.org e-Print Archive

Crossref

Recommended from our members

The reliability of diverse systems: a contribution using modelling of the fault creation process

Author: Popov P. T.
Strigini L.
Publication venue
Publication date: 01/01/2001
Field of study

Design diversity is a defence against design faults causing common-mode failure in redundant systems, but we badly lack knowledge about how much reliability it will buy in practice, and thus about its cost-effectiveness, the situations in which it is an appropriate solution and how it should be taken into account by assessors and safety regulators. Both current practice and the scientific debate about design diversity depend largely on intuition. More formal probabilistic reasoning would facilitate critical discussion and empirical validation of any predictions: to this aim, we propose a model of the generation of faults and failures in two separately-developed program versions. We show results on: (i) what degree of reliability improvement an assessor can reliably expect from diversity; and (ii) how this reliability improvement may change with higher-quality development processes. We discuss the practical relevance of these results and the degree to which they can be trusted

City Research Online

Proportional sampling strategy: A compendium and some insights

Author: Chen TY
Tse TH
Yu YT
Publication venue: 'Elsevier BV'
Publication date: 01/01/2001
Field of study

There have been numerous studies on the effectiveness of partition and random testing. In particular, the proportional sampling (PS) strategy has been proved, under certain conditions, to be the only form of partition testing that outperforms random testing regardless of where the failure-causing inputs are. This paper provides an integrated synthesis and overview of our recent studies on the PS strategy and its related work. Through this synthesis, we offer a perspective that properly interprets the results obtained so far, and present some of the interesting issues involved and new insights obtained during the course of this research. © 2001 Elsevier Science Inc. All rights reserved.postprin

HKU Scholars Hub