13 research outputs found
Automatic Software Test Data Generation for Spanning Sets Coverage Using Genetic Algorithms
Software testing consumes a considerable share of the time and resources spent on producing software, so ways to reduce its cost are valuable. The concept of spanning sets of entities, introduced by Marré and Bertolino, is useful for reducing the cost of testing: test data generation can be targeted to cover only the entities in the spanning set rather than all the entities in the tested program. Marré and Bertolino presented an algorithm, based on the subsumption relation between entities, to find spanning sets for a family of control flow and data flow-based test coverage criteria. This paper presents a new general technique for automatic test data generation for spanning sets coverage. The proposed technique applies Marré and Bertolino's algorithm to automatically generate the spanning sets of program entities that satisfy a wide range of control flow and data flow-based test coverage criteria, and then uses a genetic algorithm to automatically generate sets of test data to cover these spanning sets. The technique employs spanning sets to limit the number of test cases, guide test case selection, avoid redundant test cases, and automate test path generation.
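To illustrate the genetic-algorithm component, the following is a minimal sketch of GA-based test data generation targeting a spanning set. The toy program, branch IDs, and GA parameters are illustrative assumptions, not the paper's implementation.

```python
# A minimal sketch: evolve test inputs until every spanning-set branch is covered.
import random

def program_under_test(x, y):
    """Toy program; returns the set of branch IDs its execution covers."""
    covered = set()
    if x > 0:
        covered.add("b1")
        if y > x:
            covered.add("b2")
        else:
            covered.add("b3")
    else:
        covered.add("b4")
    return covered

# Hypothetical spanning set: covering these branches implies covering the rest.
SPANNING_SET = {"b2", "b3", "b4"}

def fitness(individual):
    # Number of spanning-set entities covered by this test input.
    return len(program_under_test(*individual) & SPANNING_SET)

def evolve(pop_size=20, generations=50):
    pop = [(random.randint(-100, 100), random.randint(-100, 100))
           for _ in range(pop_size)]
    suite, covered = [], set()
    for _ in range(generations):
        # Keep any individual that covers a not-yet-covered target entity.
        for ind in pop:
            new = (program_under_test(*ind) & SPANNING_SET) - covered
            if new:
                suite.append(ind)
                covered |= new
        if covered == SPANNING_SET:
            break
        # Selection, one-point crossover, and mutation.
        pop.sort(key=fitness, reverse=True)
        parents = pop[:pop_size // 2]
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = random.sample(parents, 2)
            child = (a[0], b[1])
            if random.random() < 0.2:
                child = (child[0] + random.randint(-10, 10), child[1])
            children.append(child)
        pop = parents + children
    return suite, covered

if __name__ == "__main__":
    tests, hit = evolve()
    print("test inputs:", tests, "covered:", hit)
```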
Infrastructure Support for Controlled Experimentation with Software Testing and Regression Testing Techniques
Where the development, understanding, and assessment of software testing and regression testing techniques are concerned, controlled experimentation is an indispensable research methodology. Obtaining the infrastructure necessary to support rigorous controlled experimentation with testing techniques, however, is difficult and expensive. As a result, progress in experimentation with testing techniques has been slow, and empirical data on the costs and effectiveness of testing techniques remains relatively scarce. To help address this problem, we have been designing and constructing infrastructure to support controlled experimentation with software testing and regression testing techniques. This paper reports on the challenges faced by researchers experimenting with testing techniques, including those that inform the design of our infrastructure. The paper then describes the infrastructure that we are creating in response to these challenges, and that we are now making available to other researchers, and discusses the impact that this infrastructure has had and can be expected to have on controlled experimentation with testing techniques.
FAST Approaches to Scalable Similarity-based Test Case Prioritization
Many test case prioritization criteria have been proposed for speeding up fault detection. Among them, similarity-based approaches give priority to the test cases that are the most dissimilar from those already selected. However, the proposed criteria do not scale to the test suites of modern industrial systems, which can comprise many thousands or even millions of test cases, so simple heuristics are used instead. We introduce the FAST family of test case prioritization techniques, which radically changes this landscape by borrowing algorithms commonly exploited in the big data domain to find similar items. FAST techniques provide scalable similarity-based test case prioritization in both white-box and black-box fashion. The results from experimentation on real-world C and Java subjects show that the fastest members of the family outperform other black-box approaches in efficiency with no significant impact on effectiveness, and also outperform white-box approaches, including greedy ones, when preparation time is not counted. A simulation study of scalability shows that one FAST technique can prioritize a million test cases in less than 20 minutes.
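As an illustration of the similarity-based idea, here is a minimal sketch that prioritizes tests by minhash-estimated dissimilarity. The hashing scheme and the greedy loop are assumptions for illustration; note that the greedy loop shown is quadratic, whereas FAST's contribution is precisely to replace such pairwise comparisons with scalable similar-item algorithms.

```python
# A minimal sketch of similarity-based prioritization with minhash signatures;
# illustrative, not the paper's implementation.
import random

NUM_HASHES = 64
random.seed(0)
# Each hash function is simulated by XOR-ing element hashes with a random mask.
MASKS = [random.getrandbits(32) for _ in range(NUM_HASHES)]

def minhash(elements):
    """Signature of a test case, given the set of elements it covers/contains."""
    hashes = [hash(e) & 0xFFFFFFFF for e in elements]
    return [min(h ^ m for h in hashes) for m in MASKS]

def similarity(sig_a, sig_b):
    """Estimated Jaccard similarity from two signatures."""
    return sum(a == b for a, b in zip(sig_a, sig_b)) / NUM_HASHES

def prioritize(test_suite):
    """Greedily order tests, picking the one most dissimilar from those chosen."""
    sigs = {t: minhash(elems) for t, elems in test_suite.items()}
    order = [next(iter(test_suite))]          # start from an arbitrary test
    remaining = set(test_suite) - set(order)
    while remaining:
        # Pick the test whose maximum similarity to the selected ones is lowest.
        nxt = min(remaining,
                  key=lambda t: max(similarity(sigs[t], sigs[s]) for s in order))
        order.append(nxt)
        remaining.remove(nxt)
    return order

# Example: tests as sets of covered statements (white-box) or input tokens (black-box).
suite = {"t1": {"s1", "s2"}, "t2": {"s1", "s2", "s3"}, "t3": {"s7", "s8"}}
print(prioritize(suite))
```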
Mutant reduction based on dominance relation for weak mutation testing
Context: As a fault-based testing technique, mutation testing is effective at evaluating the quality of existing test suites. However, the large number of mutants results in high computational cost, so mutant reduction is of great importance for improving the efficiency of mutation testing. Objective: We aim to reduce mutants for weak mutation testing based on the dominance relation between mutant branches. Method: In our method, a new program is formed by inserting mutant branches into the original program. By analyzing the dominance relation between mutant branches in the new program, the non-dominated mutant branches are identified, and the mutants corresponding to them form the reduced mutant set. Results: The proposed method is applied to test ten benchmark programs and six classes from open-source projects. The experimental results show that our method reduces over 80% of mutants on average, which greatly improves the efficiency of mutation testing. Conclusion: We conclude that the dominance relation between mutant branches is very important and useful in reducing mutants for mutation testing.
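As an illustration of dominance-based reduction, the following sketch keeps only non-dominated mutants. The paper derives dominance between mutant branches by analyzing the instrumented program; this sketch instead infers dominance from a hypothetical kill matrix, which is an assumption made for illustration.

```python
# A minimal sketch of dominance-based mutant reduction over a kill matrix.

def reduce_mutants(kill_matrix):
    """kill_matrix: dict mapping mutant ID -> set of tests that (weakly) kill it.
    A mutant m' dominates m if every test killing m' also kills m; killing the
    dominant mutant then guarantees killing the dominated one, so m is redundant.
    Returns the IDs of the non-dominated mutants to keep."""
    keep = []
    ids = sorted(kill_matrix)
    for m in ids:
        dominated = False
        for other in ids:
            if other == m:
                continue
            killed_other, killed_m = kill_matrix[other], kill_matrix[m]
            # Proper subset, or equal sets resolved by ID so one of two
            # duplicates survives.
            if killed_other < killed_m or (killed_other == killed_m and other < m):
                dominated = True
                break
        if not dominated:
            keep.append(m)
    return keep

# Example: m2 dominates m1, since the only test killing m2 also kills m1.
matrix = {"m1": {"t1", "t2"}, "m2": {"t1"}, "m3": {"t3"}}
print(reduce_mutants(matrix))  # ['m2', 'm3']
```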
Revisiting the Relationship Between Fault Detection, Test Adequacy Criteria, and Test Set Size
The research community has long recognized a complex interrelationship between test set size, test adequacy criteria, and test effectiveness in terms of fault detection. However, there is substantial confusion about the role and importance of controlling for test set size when assessing and comparing test adequacy criteria. This paper makes the following contributions: (1) A review of contradictory analyses of the relationship between fault detection, test suite size, and test adequacy criteria. Specifically, this paper addresses the supposed contradiction of prior work and explains why test suite size is neither a confounding variable, as previously suggested, nor an independent variable that should be experimentally manipulated. (2) An explication and discussion of the experimental design and sampling strategies of prior work, together with a discussion of conceptual and statistical problems, and specific guidelines for future work. (3) A methodology for comparing test adequacy criteria on an equal basis, which accounts for test suite size by treating it as a covariate. (4) An empirical evaluation that compares the effectiveness of coverage-based and mutation-based testing to one another and to random testing. Additionally, this paper proposes probabilistic coupling, a methodology for approximating the representativeness of a set of test goals for a given set of real faults.
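As an illustration of contribution (3), the sketch below fits an ANCOVA-style linear model that compares criteria while adjusting for suite size as a covariate, rather than fixing size a priori. The column names and data are hypothetical, not the paper's dataset or analysis code.

```python
# A minimal sketch of treating test suite size as a covariate when comparing
# adequacy criteria; illustrative data only.
import pandas as pd
import statsmodels.formula.api as smf

df = pd.DataFrame({
    "criterion":     ["coverage", "coverage", "mutation", "mutation",
                      "random", "random"],
    "size":          [10, 40, 12, 35, 11, 38],                # number of tests
    "effectiveness": [0.30, 0.55, 0.42, 0.70, 0.20, 0.45],    # faults detected
})

# Effectiveness is modeled as a function of the criterion (factor of interest)
# while adjusting for suite size (covariate).
model = smf.ols("effectiveness ~ C(criterion) + size", data=df).fit()
print(model.summary())
```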
Ensuring interoperability between network elements in next generation networks
Next Generation Networks (NGNs), based on the Internet Protocol (IP), implement several services such as IP-based telephony and are beginning to replace classic telephony systems. Due to the development and implementation of new, powerful services, these systems are becoming increasingly complex.
Implementing these new services (typically software-based network elements) is often accompanied by unexpected and erratic behaviour which can manifest as interoperability problems. The reason for this is insufficient testing at the developing companies: testing such products is by nature a costly and time-consuming exercise and is therefore cut down to what is considered an acceptable level.
Ensuring interoperability between network elements is a known challenge. However, there exists no established concept of which testing methods should be utilised to achieve an acceptable level of quality. The objective of this thesis was to improve the interoperability between network elements in NGNs by creating a testing scheme comprising three diverse testing methods: conformance testing, interoperability testing and post-hoc analysis.
In the first project, a novel conformance testing methodology for developing sets of conformance test cases for service specifications in NGNs was proposed. This methodology significantly improves the chance of interoperability and provides a considerable enhancement to the currently used interoperability tests. It was evaluated by successfully applying it to the Presence Service.
The second report proposed a post-hoc methodology which enables the identification of the ultimate causes of interoperability problems in an NGN in daily operation. The new methods were implemented in the tool IMPACT (IP-Based Multi Protocol Post-hoc Analyzer and Conformance Tester), which stores all messages exchanged between network elements in a database. Using SQL queries, the causes of errors can be found efficiently.
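As an illustration of this post-hoc analysis idea, the sketch below stores exchanged messages in a database and queries them for error responses. The schema, message fields, and query are hypothetical assumptions; IMPACT's actual data model is not described here.

```python
# A minimal sketch: persist exchanged protocol messages, then query for errors
# as a starting point for tracing the cause of an interoperability problem.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE messages (
    id INTEGER PRIMARY KEY, src TEXT, dst TEXT,
    protocol TEXT, method TEXT, status INTEGER, ts REAL)""")
conn.executemany(
    "INSERT INTO messages (src, dst, protocol, method, status, ts) "
    "VALUES (?, ?, ?, ?, ?, ?)",
    [("ua1", "proxy", "SIP", "INVITE", 200, 1.0),
     ("ua2", "proxy", "SIP", "INVITE", 488, 2.0)])  # 488: Not Acceptable Here

# Which exchanges drew error responses (4xx and above)?
for row in conn.execute(
        "SELECT src, dst, method, status FROM messages "
        "WHERE status >= 400 ORDER BY ts"):
    print(row)
```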
Overall, the presented testing scheme significantly improves the chance that network elements interoperate successfully by providing new methods. Beyond that, the quality of the software product is raised by mapping these methods to phases in a process model and providing well-defined guidance on which test method is best suited at each stage.
Software Batch Testing to Reduce Build Test Executions
Testing is expensive, and batching tests has the potential to reduce test costs. The continuous integration strategy of testing each commit or change individually helps to quickly identify faults but leads to the maximum number of test executions. Large companies that have a large number of commits, e.g. Google and Facebook, or that have expensive test infrastructure, e.g. Ericsson, must batch changes together to reduce the total number of test runs. For example, if eight builds are batched together and there is no failure, then we have tested eight builds with one execution, saving seven executions. However, when a failure occurs it is not immediately clear which build caused it, so a bisection is run to isolate the failing build, i.e. the culprit build. In our eight-build example, a failure requires an additional six executions, resulting in a saving of only one execution.
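The arithmetic of this example can be made concrete with a small sketch of batch testing with bisection; here run_tests is a stand-in for a real CI test execution, an assumption for illustration.

```python
# A minimal sketch of batch testing with bisection, matching the counts above.

def run_tests(builds, failing):
    """Returns True iff the batched builds pass (no failing build present)."""
    return not (set(builds) & failing)

def batch_bisect(builds, failing):
    """Tests a batch; on failure, bisects to isolate culprits.
    Returns the number of test executions used."""
    executions = 1
    if run_tests(builds, failing) or len(builds) == 1:
        return executions
    mid = len(builds) // 2
    executions += batch_bisect(builds[:mid], failing)
    executions += batch_bisect(builds[mid:], failing)
    return executions

builds = list(range(8))
print(batch_bisect(builds, failing=set()))   # 1 execution: saves 7 vs TestAll
print(batch_bisect(builds, failing={5}))     # 7 executions: saves 1 vs TestAll
```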
The goal of this work is to improve the efficiency of batch testing. We evaluate six approaches. The first is the baseline approach that tests each build individually. The second is the existing bisection approach. The third uses a batch size of four, which we show mathematically reduces the number of executions without requiring bisection. The fourth combines the two prior techniques by introducing a stopping condition to the bisection. The final two approaches use models of build change risk to isolate risky changes and test them in smaller batches.
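The stopping-condition variant (BatchStop4, as named below) can be sketched as follows; the stop-at-four rule is a reading of the description above, not the authors' code.

```python
# A sketch of bisection with a stopping condition: once a failing batch is down
# to four builds, test those builds individually instead of bisecting further.

def run_tests(builds, failing):
    """Returns True iff the batched builds pass (no failing build present)."""
    return not (set(builds) & failing)

def batch_stop4(builds, failing):
    """Returns the number of test executions used."""
    executions = 1
    if run_tests(builds, failing) or len(builds) == 1:
        return executions
    if len(builds) <= 4:
        # Stop bisecting: run each build in the failing batch on its own.
        return executions + len(builds)
    mid = len(builds) // 2
    executions += batch_stop4(builds[:mid], failing)
    executions += batch_stop4(builds[mid:], failing)
    return executions

print(batch_stop4(list(range(8)), failing={5}))  # 1 + 1 + (1 + 4) = 7
```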
We evaluate the approaches on nine open source projects that use Travis CI. Compared to the TestAll baseline, on average, the approaches reduce the number of build test executions across projects by 46%, 48%, 50%, 44%, and 49% for BatchBisect, Batch4, BatchStop4, RiskTopN, and RiskBatch, respectively. The greatest reduction is BatchStop4 at 50%; however, the simpler Batch4 approach does not require bisection and still achieves a reduction of 48%. We recommend that all CI pipelines use a batch size of at least four. We release our scripts and data for replication.
Regardless of the approach, on average, we save around half the build test executions compared to testing each change individually. We release the BatchBuilder tool, which automatically batches submitted changes on GitHub for testing on Travis CI. Since the tool reports individual results for each pull request or pushed commit, the batching happens in the background and the development process is unchanged.