12,414 research outputs found
Enablers and Impediments for Collaborative Research in Software Testing: An Empirical Exploration
When it comes to industrial organizations, current collaboration efforts in
software engineering research are very often kept in-house, depriving these
organizations off the skills necessary to build independent collaborative
research. The current trend, towards empirical software engineering research,
requires certain standards to be established which would guide these
collaborative efforts in creating a strong partnership that promotes
independent, evidence-based, software engineering research. This paper examines
key enabling factors for an efficient and effective industry-academia
collaboration in the software testing domain. A major finding of the research
was that while technology is a strong enabler to better collaboration, it must
be complemented with industrial openness to disclose research results and the
use of a dedicated tooling platform. We use as an example an automated test
generation approach that has been developed in the last two years
collaboratively with Bombardier Transportation AB in Sweden
Evaluation of Mutation Testing in a Nuclear Industry Case Study
For software quality assurance, many safety-critical industries appeal to the use of dynamic testing and structural coverage criteria. However, there are reasons to doubt the adequacy of such practices. Mutation testing has been suggested as an alternative or complementary approach but its cost has traditionally hindered its adoption by industry, and there are limited studies applying it to real safety-critical code. This paper evaluates the effectiveness of state-of-the-art mutation testing on safety-critical code from within the U.K. nuclear industry, in terms of revealing flaws in test suites that already meet the structural coverage criteria recommended by relevant safety standards. It also assesses the practical feasibility of implementing such mutation testing in a real setting. We applied a conventional selective mutation approach to a C codebase supplied by a nuclear industry partner and measured the mutation score achieved by the existing test suite. We repeated the experiment using trivial compiler equivalence (TCE) to assess the benefit that it might provide. Using a conventional approach, it first appeared that the existing test suite only killed 82% of the mutants, but applying TCE revealed that it killed 92%. The difference was due to equivalent or duplicate mutants that TCE eliminated. We then added new tests to kill all the surviving mutants, increasing the test suite size by 18% in the process. In conclusion, mutation testing can potentially improve fault detection compared to structural-coverage-guided testing, and may be affordable in a nuclear industry context. The industry feedback on our results was positive, although further evidence is needed from application of mutation testing to software with known real faults
Automatic Software Repair: a Bibliography
This article presents a survey on automatic software repair. Automatic
software repair consists of automatically finding a solution to software bugs
without human intervention. This article considers all kinds of repairs. First,
it discusses behavioral repair where test suites, contracts, models, and
crashing inputs are taken as oracle. Second, it discusses state repair, also
known as runtime repair or runtime recovery, with techniques such as checkpoint
and restart, reconfiguration, and invariant restoration. The uniqueness of this
article is that it spans the research communities that contribute to this body
of knowledge: software engineering, dependability, operating systems,
programming languages, and security. It provides a novel and structured
overview of the diversity of bug oracles and repair operators used in the
literature
Recommended from our members
The Effectiveness of <i>t</i>-Way Test Data Generation
Modern society is increasingly dependent on the correct functioning of software and increasingly so in areas that are considered safety related or safety critical. Therefore, there is an increasing need to be able to verify and validate that the software is in fact correct and will perform its intended function. Many approaches to this problem have been proposed; however, none seems likely to supplant the role of testing in the near future.
If we accept that there is, and will be, a continuing need to be able to test software then the question becomes one of how can this be done effectively, both in terms of ability to detect errors and in terms of cost. One avenue of research that offers prospects of improving both of these aspects is the automatic generation of test data.
There has recently been a large amount of work conducted in this area. One particularly promising direction has been the application of ideas from the field of experimental design and in particular, the field of t-way adequate factorial designs.
The area however, is not without issues; there is evidence that the technique is capable of detecting errors but that evidence is not unequivocal. Moreover, as with almost all work in the area of automatic test generation, there has been very little comparative work comparing the technique with other test data generation techniques. Worse, there has been effectively no work done that compares any automatic test data generation technique with the effectiveness of tests generated by humans. Another major issue with the technique is the number of tests that applying the technique can result in. This implies that there is a need for an automated oracle if the technique is to be successfully applied. The flaw with this is of course that in most situations the oracle is the human that is conducting the tests, a point often ignored in testing research.
The work presented here addresses both of these points. To do this I have used a code base taken from an industrial engine control system that has an existing set of high quality unit tests developed by hand. To complement this, several other techniques for automatically generating test data have been applied, namely random testing, random experimental designs and a technique for generating single factor experiments. To address the issue of being able to compare the error detection ability of all of the sets of test vectors, rather than the usual effectiveness surrogates of code coverage I have used mutation analysis on the code base to directly measure the ability of each set of test vectors to discover common coding errors. The results presented here show that test data generation techniques based on t-way factorial designs are at least as effective as handgenerated tests and superior to random testing and the factor experimental technique.
The oracle problem associated with the factorial design techniques was addressed using a test set minimisation approach. The mutation tool monitored which vectors could “kill” which code mutants. After a subset of the test vectors had been run, the most effective vectors were retained and the rest discarded. Likewise, mutants that were killed were removed from further consideration and the process repeated. Experimental results show that this minimisation procedure is effective at reducing computational overhead and is capable of producing final sets of test vectors that are comparable in size with the sets of hand-generated tests and so amenable to final hand checking
- …