3,075 research outputs found
Visualizing test diversity to support test optimisation
Diversity has been used as an effective criteria to optimise test suites for
cost-effective testing. Particularly, diversity-based (alternatively referred
to as similarity-based) techniques have the benefit of being generic and
applicable across different Systems Under Test (SUT), and have been used to
automatically select or prioritise large sets of test cases. However, it is a
challenge to feedback diversity information to developers and testers since
results are typically many-dimensional. Furthermore, the generality of
diversity-based approaches makes it harder to choose when and where to apply
them. In this paper we address these challenges by investigating: i) what are
the trade-off in using different sources of diversity (e.g., diversity of test
requirements or test scripts) to optimise large test suites, and ii) how
visualisation of test diversity data can assist testers for test optimisation
and improvement. We perform a case study on three industrial projects and
present quantitative results on the fault detection capabilities and redundancy
levels of different sets of test cases. Our key result is that test similarity
maps, based on pair-wise diversity calculations, helped industrial
practitioners identify issues with their test repositories and decide on
actions to improve. We conclude that the visualisation of diversity information
can assist testers in their maintenance and optimisation activities
Search algorithms for regression test case prioritization
Regression testing is an expensive, but important, process. Unfortunately, there may be insufficient resources to allow for the re-execution of all test cases during regression testing. In this situation, test case prioritisation techniques aim to improve the effectiveness of regression testing, by ordering the test cases so that the most beneficial are executed first. Previous work on regression test case prioritisation has focused on Greedy Algorithms. However, it is known that these algorithms may produce sub-optimal results, because they may construct results that denote only local minima within the search space. By contrast, meta-heuristic and evolutionary search algorithms aim to avoid such problems. This paper presents results from an empirical study of the application of several greedy, meta-heuristic and evolutionary search algorithms to six programs, ranging from 374 to 11,148 lines of code for 3 choices of fitness metric. The paper addresses the problems of choice of fitness metric, characterisation of landscape modality and determination of the most suitable search technique to apply. The empirical results replicate previous results concerning Greedy Algorithms. They shed light on the nature of the regression testing search space, indicating that it is multi-modal. The results also show that Genetic Algorithms perform well, although Greedy approaches are surprisingly effective, given the multi-modal nature of the landscape
Model-Based Security Testing
Security testing aims at validating software system requirements related to
security properties like confidentiality, integrity, authentication,
authorization, availability, and non-repudiation. Although security testing
techniques are available for many years, there has been little approaches that
allow for specification of test cases at a higher level of abstraction, for
enabling guidance on test identification and specification as well as for
automated test generation.
Model-based security testing (MBST) is a relatively new field and especially
dedicated to the systematic and efficient specification and documentation of
security test objectives, security test cases and test suites, as well as to
their automated or semi-automated generation. In particular, the combination of
security modelling and test generation approaches is still a challenge in
research and of high interest for industrial applications. MBST includes e.g.
security functional testing, model-based fuzzing, risk- and threat-oriented
testing, and the usage of security test patterns. This paper provides a survey
on MBST techniques and the related models as well as samples of new methods and
tools that are under development in the European ITEA2-project DIAMONDS.Comment: In Proceedings MBT 2012, arXiv:1202.582
Prioritizing MCDC test cases by spectral analysis of Boolean functions
Test case prioritization aims at scheduling test cases in an order that improves some performance goal. One performance goal is a measure of how quickly faults are detected. Such prioritization can be performed by exploiting the Fault Exposing Potential (FEP) parameters associated to the test cases. FEP is usually approximated by mutation analysis under certain fault assumptions. Although this technique is effective, it could be relatively expensive compared to the other prioritization techniques. This study proposes a cost-effective FEP approximation for prioritizing Modified Condition Decision Coverage (MCDC) test cases. A strict negative correlation between the FEP of a MCDC test case and the influence value of the associated input condition allows to order the test cases easily without the need of an extensive mutation analysis. The
method is entirely based on mathematics and it provides useful insight into how spectral analysis of Boolean functions can benefit software testing
Automatic Repair of Real Bugs: An Experience Report on the Defects4J Dataset
Defects4J is a large, peer-reviewed, structured dataset of real-world Java
bugs. Each bug in Defects4J is provided with a test suite and at least one
failing test case that triggers the bug. In this paper, we report on an
experiment to explore the effectiveness of automatic repair on Defects4J. The
result of our experiment shows that 47 bugs of the Defects4J dataset can be
automatically repaired by state-of- the-art repair. This sets a baseline for
future research on automatic repair for Java. We have manually analyzed 84
different patches to assess their real correctness. In total, 9 real Java bugs
can be correctly fixed with test-suite based repair. This analysis shows that
test-suite based repair suffers from under-specified bugs, for which trivial
and incorrect patches still pass the test suite. With respect to practical
applicability, it takes in average 14.8 minutes to find a patch. The experiment
was done on a scientific grid, totaling 17.6 days of computation time. All
their systems and experimental results are publicly available on Github in
order to facilitate future research on automatic repair
- …