Automated Repair of Feature Interaction Failures in Automated Driving Systems
In recent years, several automated repair strategies have been
proposed to fix bugs in individual software programs without any
human intervention. There has been, however, little work on how
automated repair techniques can resolve failures that arise at the
system-level and are caused by undesired interactions among different
system components or functions. Feature interaction failures
are common in complex systems such as autonomous cars that are
typically built as a composition of independent features (i.e., units
of functionality). In this paper, we propose a repair technique to
automatically resolve undesired feature interaction failures in automated
driving systems (ADS) that lead to the violation of system
safety requirements. Our repair strategy achieves its goal by (1) localizing
faults spanning several lines of code, (2) simultaneously
resolving multiple interaction failures caused by independent faults,
(3) scaling repair strategies from the unit-level to the system-level,
and (4) resolving failures based on their order of severity. We have
evaluated our approach using two industrial ADS containing four
features. Our results show that our repair strategy resolves the
undesired interaction failures in these two systems in less than 16h
and outperforms existing automated repair techniques.
Achievements, open problems and challenges for search based software testing
Search Based Software Testing (SBST) formulates testing as an optimisation problem, which can be attacked using computational search techniques from the field of Search Based Software Engineering (SBSE). We present an analysis of the SBST research agenda, focusing on the open problems and challenges of testing non-functional properties, in particular a topic we call 'Search Based Energy Testing' (SBET), Multi-objective SBST, and SBST for Test Strategy Identification. We conclude with a vision of FIFIVERIFY tools, which would automatically find faults, fix them and verify the fixes. We explain why we think such FIFIVERIFY tools constitute an exciting challenge for the SBSE community that could already be within its reach.
Angels and monsters: An empirical investigation of potential test effectiveness and efficiency improvement from strongly subsuming higher order mutation
We study the simultaneous test effectiveness and efficiency improvement achievable by Strongly Subsuming Higher Order Mutants (SSHOMs), constructed from 15,792 first order mutants in four Java programs. Using SSHOMs in place of the first order mutants they subsume yielded a 35%-45% reduction in the number of mutants required, while simultaneously improving test efficiency by 15% and effectiveness by between 5.6% and 12%. Trivial first order faults often combine to form exceptionally non-trivial higher order faults; apparently innocuous angels can combine to breed monsters. Nevertheless, these same monsters can be recruited to improve automated test effectiveness and efficiency.
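The abstract does not spell out what makes a higher order mutant "strongly subsuming". A minimal sketch follows, assuming the definition commonly used in the higher order mutation literature: the mutant is killed by at least one test, and every test that kills it also kills each of its constituent first order mutants. The function name and the test identifiers are hypothetical and are not taken from the paper.

```python
from typing import List, Set


def is_strongly_subsuming(hom_kills: Set[str], fom_kills: List[Set[str]]) -> bool:
    """Check the assumed strong-subsumption condition for one higher order mutant.

    hom_kills: tests that kill the higher order mutant (HOM).
    fom_kills: for each constituent first order mutant (FOM), the tests that kill it.
    """
    # Tests that kill every constituent FOM.
    common = set.intersection(*fom_kills)
    # The HOM must be killable, and only by tests that kill all of its FOMs.
    return bool(hom_kills) and hom_kills <= common


# Hypothetical kill sets for two first order mutants.
foms = [{"t1", "t2", "t3"}, {"t2", "t3", "t4"}]

narrow_hom = {"t2"}        # killed only by a test that kills both FOMs
broad_hom = {"t1", "t2"}   # t1 does not kill the second FOM
unkilled_hom = set()       # never killed, so it cannot subsume anything
```

Under this assumed definition, `narrow_hom` qualifies and the other two do not, which is why one such mutant can stand in for the first order mutants it subsumes.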
An Analysis of Patch Plausibility and Correctness for Generate-And-Validate Patch Generation Systems
We analyze reported patches for three existing generate-and-validate patch generation systems (GenProg, RSRepair, and AE). The basic principle behind generate-and-validate systems is to accept only plausible patches that produce correct outputs for all inputs in the test suite used to validate the patches. Because of errors in the patch evaluation infrastructure, the majority of the reported patches are not plausible: they do not produce correct outputs even for the inputs in the validation test suite. The overwhelming majority of the reported patches are not correct and are equivalent to a single modification that simply deletes functionality. Observed negative effects include the introduction of security vulnerabilities and the elimination of desirable standard functionality. We also present Kali, a generate-and-validate patch generation system that only deletes functionality. Working with a simpler and more effectively focused search space, Kali generates at least as many correct patches as the prior GenProg, RSRepair, and AE systems. Kali also generates at least as many patches that produce correct outputs for the inputs in the validation test suite as the three prior systems. We also discuss patches produced by ClearView, a generate-and-validate binary hot patching system that leverages learned invariants to produce patches that enable systems to survive otherwise fatal defects and security attacks. Our analysis indicates that ClearView successfully patches 9 of the 10 security vulnerabilities used to evaluate the system. At least 4 of these patches are correct.
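The distinction between a plausible patch and a correct one can be illustrated with a small sketch. It assumes a toy setting in which a program is a Python function and a validation suite is a list of (input, expected output) pairs. All names below are hypothetical and do not come from GenProg, RSRepair, AE, or Kali.

```python
from typing import Callable, Iterable, Tuple

TestCase = Tuple[int, int]  # (input, expected output)


def is_plausible(program: Callable[[int], int], suite: Iterable[TestCase]) -> bool:
    """A patch is plausible if it gives the expected output on every validation input."""
    return all(program(x) == expected for x, expected in suite)


def abs_buggy(x: int) -> int:
    # Original defect: negative inputs are returned unchanged.
    return x


def abs_overfit(x: int) -> int:
    # Candidate patch that special-cases the one failing test input.
    return 3 if x == -3 else x


def abs_correct(x: int) -> int:
    # Candidate patch that actually implements absolute value.
    return -x if x < 0 else x


validation_suite = [(2, 2), (-3, 3)]
```

Here `abs_overfit` passes the whole validation suite and so is plausible, yet it still returns -5 for the input -5. Passing the suite is therefore a necessary condition for correctness but not a sufficient one, which is the gap the analysis above examines.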