
    Redefining and Evaluating Coverage Criteria Based on the Testing Scope

    Test coverage information can help testers decide when to stop testing and how to augment their test suites when the measured coverage is not deemed sufficient. Since the notion of a test criterion was introduced in the 1970s, research on coverage testing has been very active, with much effort dedicated to defining new, more cost-effective coverage criteria or to adapting existing ones to different domains. All these studies share the premise that, after defining the entity to be covered (e.g., branches), a program cannot be considered adequately tested if some of its entities have never been exercised by any input data. However, not all entities are of interest in every context. This is particularly true for several paradigms that emerged in the last decade (e.g., component-based development, service-oriented architecture), where traditional coverage metrics might not always provide meaningful information. In this thesis we address this situation and redefine coverage criteria to focus on the program parts that are relevant to the testing scope. We instantiate this general notion of scope-based coverage by introducing three coverage criteria and demonstrate how they can be applied in different testing contexts. When applied in the context of software reuse, our approach proved useful for supporting test case prioritization, selection and minimization. Our studies showed that for prioritization we can improve the average rate of faults detected. For test case selection and minimization, we can considerably reduce the test suite size with little to no extra impact on fault detection effectiveness. When the source code is not available, such as in the service-oriented architecture paradigm, we propose an approach that customizes coverage, measured on invocations at the service interface, based on data from similar users. We applied this approach to a real-world application and, in our study, we were able to predict the entities that would be of interest to a given user with high precision. Finally, we introduce the first coverage criterion of its kind for operational-profile-based testing, which exploits program spectra obtained from usage traces. Our study showed that it is better correlated than traditional coverage with the probability that the next test input will fail, which implies that our approach can provide a better stopping rule. Promising results were also observed for test case selection. Our redefinition of coverage criteria approaches the topic of coverage testing from a completely different angle. Such a novel perspective paves the way for new, still unexplored avenues of research towards improving the cost-effectiveness of testing.
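    As an illustration of the scope-based idea, a minimal, hypothetical sketch is shown below: coverage is computed only over a set of in-scope entities, and test cases are prioritized greedily by how much scoped coverage they add. The entity names, test identifiers and the greedy strategy are assumptions for illustration, not the thesis' actual criteria.

```python
# Hypothetical sketch of scope-based coverage and greedy prioritization.
# "Entities" could be branches, statements, or service invocations; here
# they are just opaque string identifiers.

def scoped_coverage(covered, scope):
    """Fraction of in-scope entities exercised by a test suite."""
    exercised = set().union(*covered.values()) & scope
    return len(exercised) / len(scope) if scope else 1.0

def prioritize(covered, scope):
    """Greedy 'additional' prioritization restricted to the testing scope."""
    remaining, order = set(scope), []
    tests = dict(covered)
    while tests and remaining:
        best = max(tests, key=lambda t: len(tests[t] & remaining))
        order.append(best)
        remaining -= tests.pop(best)
    order.extend(tests)          # tests adding no scoped coverage go last
    return order

covered = {"t1": {"b1", "b2"}, "t2": {"b3"}, "t3": {"b2", "b4"}}
scope = {"b2", "b3"}             # only these entities matter to the tester
print(scoped_coverage(covered, scope))   # 1.0
print(prioritize(covered, scope))        # e.g. ['t1', 't2', 't3']
```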

    A fault syndromes simulator for random access memories

    Testing and diagnosis techniques play a key role in the advance of semiconductor memory technology. The challenge of failure detection has prompted intensive investigation into efficient testing and diagnosis algorithms for better fault coverage and diagnostic resolution. At present, March test algorithms are used to detect and diagnose all faults related to Random Access Memories; they also allow faults to be located and identified. However, the test and diagnosis process is mainly carried out manually, so a systematic approach for developing and evaluating memory test algorithms is required. This work focuses on implementing March-based test algorithms in a software simulator tool to provide fast and systematic memory testing. The simulator allows a user, through a GUI, to select a March-based test algorithm depending on the desired fault coverage and diagnostic resolution. Experimental results show that testing with the simulator is more efficient than the traditional testing approach. After a set of test algorithms has been chosen, the simulator displays a detailed list of the coupling faults and stuck-at faults covered by each algorithm, together with the corresponding coverage percentages and the percentage of diagnostic resolution. This shows that the simulator reduces the trade-off between test time, fault coverage and diagnostic resolution. Moreover, the chosen algorithm can be incorporated into memory built-in self-test and diagnosis to achieve better fault coverage and diagnostic resolution. Universities and industry groups involved in memory built-in self-test, built-in self-repair and built-in self-diagnosis will benefit by saving a few years of effort in finding an efficient algorithm to implement in their designs.
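    To illustrate how a March-based algorithm exposes a memory fault, the following minimal sketch (an assumption for illustration, not the paper's simulator) runs the standard MATS+ March test over a small simulated memory with one injected stuck-at-0 cell.

```python
# Minimal, hypothetical sketch of running the MATS+ March test over a small
# simulated memory with one injected stuck-at-0 fault; this is not the
# paper's simulator, only an illustration of how March elements detect faults.

class FaultyMemory:
    def __init__(self, size, stuck_at_zero=None):
        self.cells = [0] * size
        self.stuck = stuck_at_zero        # address of a stuck-at-0 cell

    def write(self, addr, value):
        self.cells[addr] = 0 if addr == self.stuck else value

    def read(self, addr):
        return self.cells[addr]

def mats_plus(mem, size):
    """MATS+: {up/down(w0); up(r0,w1); down(r1,w0)}; returns failing addresses."""
    failures = []
    for a in range(size):                 # element 1: write 0 everywhere
        mem.write(a, 0)
    for a in range(size):                 # element 2: read 0, write 1 (ascending)
        if mem.read(a) != 0:
            failures.append(a)
        mem.write(a, 1)
    for a in reversed(range(size)):       # element 3: read 1, write 0 (descending)
        if mem.read(a) != 1:
            failures.append(a)
        mem.write(a, 0)
    return failures

mem = FaultyMemory(size=8, stuck_at_zero=5)
print(mats_plus(mem, 8))                  # [5]: the stuck-at-0 cell is caught
```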

    Scaling up HIV self-testing in sub-Saharan Africa: a review of technology, policy and evidence.

    PURPOSE OF REVIEW: HIV self-testing (HIVST) can provide complementary coverage to existing HIV testing services and improve knowledge of status among HIV-infected individuals. This review summarizes the current technology, policy and evidence landscape in sub-Saharan Africa and the priorities within a rapidly evolving field. RECENT FINDINGS: HIVST is moving towards scaled implementation, with the release of WHO guidelines, WHO prequalification of the first HIVST product, price reductions of HIVST products and a growing product pipeline. Multicountry evidence from southern and eastern Africa confirms high feasibility, acceptability and accuracy across many delivery models and populations, with minimal harms. Evidence on the effectiveness of HIVST in increasing testing coverage is strong, while evidence on demand generation for follow-on HIV prevention and treatment services and on cost-effective delivery is emerging. Despite these developments, HIVST delivery remains limited outside of pilot implementation. SUMMARY: Important technology gaps include increasing the availability of more sensitive HIVST products in low- and middle-income countries. Regulatory and postmarket surveillance systems for HIVST also require further development. Randomized trials evaluating effectiveness and cost-effectiveness under multiple distribution models, including unrestricted delivery and with a focus on linkage to HIV prevention and treatment, remain priorities. Diversification of studies to include west and central Africa and blood-based products should also be addressed.

    Guiding Deep Learning System Testing using Surprise Adequacy

    Deep Learning (DL) systems are rapidly being adopted in safety- and security-critical domains, urgently calling for ways to test their correctness and robustness. Testing of DL systems has traditionally relied on manual collection and labelling of data. Recently, a number of coverage criteria based on neuron activation values have been proposed. These criteria essentially count the number of neurons whose activation during the execution of a DL system satisfies certain properties, such as being above predefined thresholds. However, existing coverage criteria are not sufficiently fine grained to capture the subtle behaviours exhibited by DL systems. Moreover, evaluations have focused on showing correlation between adversarial examples and the proposed criteria rather than on evaluating and guiding their use for actual testing of DL systems. We propose a novel test adequacy criterion for testing of DL systems, called Surprise Adequacy for Deep Learning Systems (SADL), which is based on the behaviour of DL systems with respect to their training data. We measure the surprise of an input as the difference in the DL system's behaviour between the input and the training data (i.e., what was learnt during training), and subsequently develop this into an adequacy criterion: a good test input should be sufficiently, but not overly, surprising compared to the training data. Empirical evaluation using a range of DL systems, from simple image classifiers to autonomous driving car platforms, shows that systematic sampling of inputs based on their surprise can improve the classification accuracy of DL systems against adversarial examples by up to 77.5% via retraining.
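    The notion of surprise can be illustrated with a simplified, hypothetical sketch: an input's surprise is taken as the distance from its activation trace to the nearest training-set trace, and a coverage figure is derived by bucketing surprise values, loosely analogous to the paper's Surprise Coverage. The actual LSA/DSA measures in SADL are richer (per-class and density- or ratio-based); this only conveys the idea of surprise as distance to the training data.

```python
# Simplified, hypothetical sketch of a distance-based surprise measure in the
# spirit of SADL, plus a bucketed coverage figure over surprise values.
import numpy as np

def surprise(train_activations, test_activation):
    """Distance from a test input's activation trace to its nearest
    training-set activation trace (higher = more surprising)."""
    dists = np.linalg.norm(train_activations - test_activation, axis=1)
    return dists.min()

def surprise_coverage(surprises, buckets, upper):
    """Fraction of surprise-value buckets in (0, upper] hit by the test set."""
    idx = np.clip((np.asarray(surprises) / upper * buckets).astype(int),
                  0, buckets - 1)
    return len(set(idx.tolist())) / buckets

rng = np.random.default_rng(0)
train = rng.normal(size=(1000, 32))          # activation traces from training data
tests = rng.normal(size=(20, 32)) * 1.5      # some inputs drift away from training
s = [surprise(train, t) for t in tests]
print(surprise_coverage(s, buckets=10, upper=max(s)))
```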

    Branch-coverage testability transformation for unstructured programs

    Test data generation by hand is a tedious, expensive and error-prone activity, yet testing is a vital part of the development process. Several techniques have been proposed to automate the generation of test data, but all of these are hindered by the presence of unstructured control flow. This paper addresses the problem using testability transformation. Testability transformation does not preserve the traditional meaning of the program; rather, it preserves test-adequate sets of input data. This requires new equivalence relations which, in turn, entail novel proof obligations. The paper illustrates this using the branch coverage adequacy criterion and develops a branch adequacy equivalence relation and a testability transformation for restructuring. It then presents a proof that the transformation preserves branch adequacy.
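    A minimal, hypothetical illustration of the preserved property follows: the original and restructured versions of a loop (early `break` versus an explicit flag) are modelled as functions returning the branch ids they exercise, and a test set is branch-adequate for one exactly when it is branch-adequate for the other. This is only a toy example of the adequacy-preservation idea, not the paper's transformation algorithm.

```python
# Hypothetical illustration of "branch adequacy preservation": a transformation
# need not preserve functional behaviour, only the property that any test set
# covering all branches of the transformed program also covers all branches of
# the original. The two versions are modelled as functions returning the set
# of branch ids they exercise for a given input.

def branches_original(x):
    taken = set()
    for i in range(3):                    # unstructured style: early exit
        if x[i] < 0:
            taken.add("B1-true")
            break
        taken.add("B1-false")
    return taken

def branches_transformed(x):
    taken = set()
    found, i = False, 0                   # restructured with an explicit flag
    while i < 3 and not found:
        if x[i] < 0:
            taken.add("B1-true")
            found = True
        else:
            taken.add("B1-false")
        i += 1
    return taken

def branch_adequate(tests, branch_fn, all_branches):
    return set().union(*(branch_fn(t) for t in tests)) >= all_branches

tests = [[1, -2, 3], [1, 2, 3]]
ALL = {"B1-true", "B1-false"}
assert branch_adequate(tests, branches_original, ALL) == \
       branch_adequate(tests, branches_transformed, ALL)   # adequacy preserved
print("branch-adequate on both:", branch_adequate(tests, branches_original, ALL))
```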

    Data generator for evaluating ETL process quality

    Obtaining the right set of data for evaluating the fulfillment of different quality factors in the extract-transform-load (ETL) process design is rather challenging. First, the real data might be out of reach due to privacy constraints, while manually providing a synthetic set of data is a labor-intensive task that needs to take various combinations of process parameters into account. More importantly, a single dataset usually does not represent the evolution of data throughout the complete process lifespan, hence missing the plethora of possible test cases. To facilitate this demanding task, in this paper we propose an automatic data generator (i.e., Bijoux). Starting from a given ETL process model, Bijoux extracts the semantics of data transformations, analyzes the constraints they imply over input data, and automatically generates testing datasets. Bijoux is highly modular and configurable, enabling end-users to generate datasets for a variety of interesting test scenarios (e.g., evaluating specific parts of an input ETL process design with different input dataset sizes, different distributions of data, and different operation selectivities). We have developed a running prototype that implements the functionality of our data generation framework, and we report experimental findings showing the effectiveness and scalability of our approach.
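    The general idea of constraint-aware data generation can be sketched as follows (a hypothetical illustration, not Bijoux's algorithm): a filter operation in the flow implies a constraint on the input rows, and the generator produces data in which a chosen fraction of rows satisfies the constraint, so that both branches of the operation are exercised with a controlled selectivity.

```python
# Hypothetical sketch of constraint-aware test-data generation for an ETL flow:
# each operation implies constraints on its input (here, a simple filter
# predicate), and the generator produces rows that both satisfy and violate
# the constraint, with a target selectivity.
import random

def generate_rows(n, satisfy, lo=0, hi=100_000, selectivity=0.3):
    """Generate n rows for a filter-style predicate, with roughly
    `selectivity` of the rows passing the filter (rejection sampling)."""
    rows = []
    for _ in range(n):
        while True:
            row = {"id": random.randint(1, 10**6),
                   "amount": random.uniform(lo, hi)}
            want_pass = random.random() < selectivity
            if satisfy(row) == want_pass:
                rows.append(row)
                break
    return rows

threshold = 50_000                       # constraint implied by a filter step
predicate = lambda r: r["amount"] > threshold
data = generate_rows(1_000, predicate)
print(sum(predicate(r) for r in data) / len(data))   # roughly 0.3 of rows pass
```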

    Automatic evaluation of generation and parsing for machine translation with automatically acquired transfer rules

    This paper presents a new method of evaluation for the generation and parsing components of transfer-based MT systems in which the transfer rules have been automatically acquired from parsed, sentence-aligned bitext corpora. The method provides a means of quantifying the upper bound imposed on the MT system by the quality of the parsing and generation technologies for the target language. We include experiments that calculate this upper bound for both handcrafted and automatically induced parsing and generation technologies currently in use by transfer-based MT systems.
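    A hypothetical sketch of the round-trip idea follows: each target-language reference sentence is parsed, a sentence is regenerated from the parse, and the regenerated output is scored against the original; the mean score bounds what the full MT system could achieve with those components. The parser, generator and similarity metric below are placeholders, not the components evaluated in the paper.

```python
# Hypothetical round-trip evaluation sketch: parse -> generate -> score against
# the reference. Parser, generator and metric are placeholders.
from difflib import SequenceMatcher

def parse(sentence):           # placeholder for the target-language parser
    return sentence.lower().split()

def generate(structure):       # placeholder for the target-language generator
    return " ".join(structure)

def round_trip_upper_bound(references):
    scores = []
    for ref in references:
        regenerated = generate(parse(ref))
        scores.append(SequenceMatcher(None, ref.lower(), regenerated).ratio())
    return sum(scores) / len(scores)

refs = ["The cat sat on the mat .", "Time flies like an arrow ."]
print(round_trip_upper_bound(refs))    # 1.0 here; below 1.0 with real components
```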

    Evaluation and optimization of frequent association rule based classification

    Deriving useful and interesting rules from a data mining system is an essential and important task. Problems such as the discovery of random or coincidental patterns, patterns with no significant value, and the generation of a large volume of rules from a database commonly occur. Work on sustaining the interestingness of rules generated by data mining algorithms is therefore being actively examined and developed. In this paper, a systematic way to evaluate the association rules discovered by frequent itemset mining algorithms is presented, combining common data mining and statistical interestingness measures and outlining an appropriate sequence of usage. The experiments are performed using a number of real-world datasets that represent diverse characteristics of data and items, and a detailed evaluation of the rule sets is provided. Empirical results show that, with a proper combination of data mining and statistical analysis, the framework is capable of eliminating a large number of non-significant, redundant and contradictory rules while preserving relatively valuable, high-accuracy and high-coverage rules when used in the classification problem. Moreover, the results reveal important characteristics of mining frequent itemsets and the impact of the confidence measure on the classification task.
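    The combination of data mining and statistical measures can be sketched as follows (hypothetical thresholds and measure choices, not necessarily the paper's exact framework): a rule A -> B is kept only if it is frequent and confident, positively correlated (lift > 1) and statistically significant under a 2x2 chi-square test.

```python
# Hypothetical sketch of combining data-mining measures (support, confidence)
# with statistical ones (lift, chi-square) to filter association rules.

def rule_measures(n, n_a, n_b, n_ab):
    """Measures for rule A -> B given transaction counts."""
    support = n_ab / n
    confidence = n_ab / n_a
    lift = confidence / (n_b / n)
    # 2x2 chi-square statistic for independence of A and B
    chi2 = 0.0
    for obs, exp in [(n_ab, n_a * n_b / n),
                     (n_a - n_ab, n_a * (n - n_b) / n),
                     (n_b - n_ab, (n - n_a) * n_b / n),
                     (n - n_a - n_b + n_ab, (n - n_a) * (n - n_b) / n)]:
        chi2 += (obs - exp) ** 2 / exp
    return support, confidence, lift, chi2

def keep_rule(n, n_a, n_b, n_ab,
              min_sup=0.01, min_conf=0.6, min_lift=1.0, chi2_crit=3.84):
    sup, conf, lift, chi2 = rule_measures(n, n_a, n_b, n_ab)
    # keep only rules that are frequent, confident, positively correlated
    # and significant at the 5% level (1 degree of freedom)
    return sup >= min_sup and conf >= min_conf and lift > min_lift and chi2 >= chi2_crit

print(keep_rule(n=10_000, n_a=800, n_b=1_200, n_ab=640))   # True: strong rule
print(keep_rule(n=10_000, n_a=800, n_b=9_000, n_ab=640))   # False: lift below 1
```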