208 research outputs found
Test-aware combinatorial interaction testing
Combinatorial interaction testing (CIT) approaches systematically sample a given configuration space and select a set of configurations in which each valid t-way option setting combination appears at least once. A battery of test cases is then executed in the selected configurations. Existing CIT approaches, however, do not provide a systematic way of handling test-specific inter-option constraints. Improper handling of such constraints causes masking effects, which in turn lead testers to develop false confidence in their test processes, believing that they have tested certain option setting combinations when in fact they have not. In this work, to avoid the harmful consequences of masking effects caused by improper handling of test-specific constraints, we compute t-way test-aware covering arrays. A t-way test-aware covering array is not just a set of configurations, as is the case in traditional covering arrays, but a set of configurations, each of which is associated with a set of test cases. We furthermore present a set of empirical studies conducted on two widely used, highly configurable software systems as our subject applications, demonstrating that test-specific constraints are likely to occur in practice and that the proposed approach is a promising and effective way of handling them.
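The coverage property described in this abstract (every t-way option setting combination appears in at least one selected configuration) can be illustrated with a minimal Python sketch. It checks a traditional covering array only and does not model test-specific constraints or the test-aware arrays the paper proposes. The function name `uncovered_combinations` and the three binary options are hypothetical and chosen purely for illustration.

```python
from itertools import combinations, product

def uncovered_combinations(options, configs, t):
    """Return every t-way option-setting combination that appears in no configuration."""
    names = sorted(options)
    missing = []
    for chosen in combinations(names, t):
        for values in product(*(options[n] for n in chosen)):
            combo = dict(zip(chosen, values))
            # A combination is covered if some configuration agrees with it on every chosen option.
            covered = any(all(cfg[n] == v for n, v in combo.items()) for cfg in configs)
            if not covered:
                missing.append(combo)
    return missing

# Hypothetical configuration space: three binary options.
options = {"a": [0, 1], "b": [0, 1], "c": [0, 1]}

# Four configurations that together cover all 2-way combinations.
configs = [
    {"a": 0, "b": 0, "c": 0},
    {"a": 0, "b": 1, "c": 1},
    {"a": 1, "b": 0, "c": 1},
    {"a": 1, "b": 1, "c": 0},
]

missing_full = uncovered_combinations(options, configs, 2)
missing_partial = uncovered_combinations(options, configs[:3], 2)
```

With all four configurations, `missing_full` is empty. Dropping the last configuration leaves three 2-way combinations uncovered, which is the kind of gap a covering array is constructed to avoid.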
Using hardware performance counters for fault localization
In this work, we use data collected from hardware performance counters as an abstraction mechanism for program executions and use these abstractions to identify likely causes of failures. Our approach can be summarized as follows: hardware counter data is collected from both successful and failed executions; the data from the successful executions is used to create normal behavior models of programs; and deviations from these models observed in failed executions are scored and reported as likely causes of failures. The results of our experiments on three open source projects suggest that the proposed approach can effectively prioritize the space of likely causes of failures, which can in turn improve the turnaround time for defect fixes.
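The model-then-score workflow summarized in this abstract can be sketched as follows. The abstract does not specify how the normal behavior models or the deviation scores are computed, so this sketch assumes a simple per-function mean and standard deviation with a z-score. The function `rank_suspects`, the function names `parse` and `eval`, and all counter values are hypothetical.

```python
from statistics import mean, stdev

def rank_suspects(passing_runs, failing_run):
    """Rank program entities by how far a failed run deviates from the passing baseline."""
    scores = {}
    for func, value in failing_run.items():
        baseline = [run[func] for run in passing_runs if func in run]
        if len(baseline) < 2:
            continue  # not enough data to build a baseline for this entity
        mu, sigma = mean(baseline), stdev(baseline)
        if sigma > 0:
            scores[func] = abs(value - mu) / sigma  # assumed scoring rule: z-score
        else:
            scores[func] = 0.0 if value == mu else float("inf")
    # Highest deviation first: the most suspicious entities lead the list.
    return sorted(scores, key=scores.get, reverse=True)

# Hypothetical counter readings (for example, instructions retired per function).
passing_runs = [
    {"parse": 100, "eval": 200},
    {"parse": 110, "eval": 210},
    {"parse": 90, "eval": 190},
]
failing_run = {"parse": 105, "eval": 400}

ranking = rank_suspects(passing_runs, failing_run)
```

Here `eval` deviates far more from its passing baseline than `parse`, so it is reported first.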
Combining hardware and software instrumentation to classify program executions
Several research efforts have studied ways to infer properties of software systems from program spectra gathered from the running systems, usually with software-level instrumentation. While these efforts appear to produce accurate classifications, detailed understanding of their costs and potential cost-benefit tradeoffs is lacking. In this work we present a hybrid instrumentation approach which uses hardware performance counters to gather program spectra at very low cost. This underlying data is further augmented with data captured by minimal amounts of software-level instrumentation. We also evaluate this hybrid approach by comparing it to other existing approaches. We conclude that these hybrid spectra can reliably distinguish failed executions from successful executions at a fraction of the runtime overhead cost of using software-based execution data.
Feedback driven adaptive combinatorial testing
The configuration spaces of modern software systems are too large to test exhaustively. Combinatorial interaction testing (CIT) approaches, such as covering arrays, systematically sample the configuration space and test only the selected configurations. The basic justification for CIT approaches is that they can cost-effectively exercise all system behaviors caused by the settings of t or fewer options. We conjecture, however, that in practice many such behaviors are not actually tested because of masking effects – failures that perturb execution so as to prevent some behaviors from being exercised. In this work we present a feedback-driven, adaptive, combinatorial testing approach aimed at detecting and working around masking effects. At each iteration we detect potential masking effects, heuristically isolate their likely causes, and then generate new covering arrays that allow previously masked combinations to be tested in the subsequent iteration. We empirically assess the effectiveness of the proposed approach on two large, widely used open source software systems. Our results suggest that masking effects do exist and that our approach provides a promising and efficient way to work around them.
Seer: a lightweight online failure prediction approach
Online failure prediction aims to predict the manifestation of failures at runtime before the failures actually occur. Existing online failure prediction approaches typically operate on data which is either directly reported by the system under test or directly observable from outside system executions. These approaches generally refrain from collecting internal execution data that could further improve the prediction quality. One reason for this trend is the runtime overhead cost incurred by the measurement instruments that are required to collect the data. In this work we conjecture that large cost reductions in collecting internal execution data for online failure prediction can derive from reducing the cost of the measurement instruments, while still supporting acceptable levels of prediction quality. To evaluate this conjecture, we present a lightweight online failure prediction approach, called Seer. Seer uses fast hardware performance counters to perform most of the data collection work. The data is augmented with further data collected by a minimal amount of software instrumentation that is added to the systems software. We refer to the data collected in this manner as hybrid spectra. We applied the proposed approach to three widely used open source subject applications and evaluated it by comparing and contrasting three types of hybrid spectra and two types of traditional software spectra. At the lowest level of runtime overhead attained in the experiments, the hybrid spectra predicted the failures about halfway through the executions with an F-measure of 0.77 and a runtime overhead of 1.98%, on average. Comparing hybrid spectra to software spectra, we observed that, for comparable runtime overhead levels, the hybrid spectra provided significantly better prediction accuracies and earlier warnings for failures than the software spectra. Alternatively, for comparable accuracy levels, the hybrid spectra incurred significantly lower runtime overheads and provided earlier warnings.
Enumerator: an efficient approach for enumerating all valid t-tuples
In this paper, we present an efficient approach for enumerating all valid t-tuples for a given configuration space model, which is an important task in computing covering arrays. The results of our experiments suggest that the proposed approach scales better than existing approaches.
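To make the task concrete, the following Python sketch enumerates valid t-tuples by brute force: a t-tuple counts as valid when at least one full configuration containing it satisfies the constraints. This is a naive baseline that visits every configuration, not the efficient approach the paper presents. The function `valid_t_tuples`, the three binary options, and the constraint "a=1 requires b=1" are hypothetical.

```python
from itertools import combinations, product

def valid_t_tuples(options, t, satisfies):
    """Naive baseline: collect the t-tuples contained in at least one valid full configuration."""
    names = sorted(options)
    valid = set()
    for values in product(*(options[n] for n in names)):
        config = dict(zip(names, values))
        if not satisfies(config):
            continue  # skip configurations that violate the constraints
        for chosen in combinations(names, t):
            valid.add(tuple((n, config[n]) for n in chosen))
    return valid

# Hypothetical configuration space model with one constraint: a=1 requires b=1.
options = {"a": [0, 1], "b": [0, 1], "c": [0, 1]}

def satisfies(cfg):
    return cfg["a"] == 0 or cfg["b"] == 1

tuples_2way = valid_t_tuples(options, 2, satisfies)
```

Of the 12 possible 2-tuples over these options, only `(("a", 1), ("b", 0))` is excluded. The exhaustive loop grows exponentially with the number of options, which is why scalable enumeration matters when computing covering arrays.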
Answer-set programming as a new approach to event-sequence testing
In many applications, faults are triggered by events that occur in a particular order. Based on the assumption that most bugs are caused by the interaction of a low number of events, Kuhn et al. recently introduced sequence covering arrays (SCAs) as suitable designs for event sequence testing. In practice, directly applying SCAs for testing is often impaired by additional constraints, and SCAs have to be adapted to fit application-specific needs. Modifying precomputed SCAs to account for problem variations can be problematic, if not impossible, and developing dedicated algorithms is costly. In this paper, we propose answer-set programming (ASP), a well-known knowledge-representation formalism from the area of artificial intelligence based on logic programming, as a declarative paradigm for computing SCAs. Our approach allows complex coverage criteria to be stated concisely in an elaboration-tolerant way, i.e., small variations of a problem specification require only small modifications of the ASP representation.
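The coverage criterion behind a sequence covering array can be illustrated with a short Python sketch: every ordering of t distinct events must occur as a (not necessarily contiguous) subsequence of at least one test sequence. The sketch only checks this property for given sequences; it does not compute SCAs and does not use ASP. The function names and the four events are hypothetical.

```python
from itertools import permutations

def is_subsequence(pattern, sequence):
    """True if the events in pattern occur in sequence in the same relative order."""
    remaining = iter(sequence)
    # Each membership test consumes the iterator up to the match, preserving order.
    return all(event in remaining for event in pattern)

def uncovered_orderings(events, sequences, t):
    """Return every ordering of t distinct events that no test sequence exercises."""
    return [
        ordering
        for ordering in permutations(events, t)
        if not any(is_subsequence(ordering, seq) for seq in sequences)
    ]

# Hypothetical events and two test sequences (one the reverse of the other).
events = ["a", "b", "c", "d"]
sequences = [list("abcd"), list("dcba")]

missing_2way = uncovered_orderings(events, sequences, 2)
missing_3way = uncovered_orderings(events, sequences, 3)
```

A sequence and its reverse cover every ordered pair of events, so `missing_2way` is empty, but the same two sequences leave 16 of the 24 three-event orderings uncovered.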
Using Screenshot Attachments in Issue Reports for Triaging
In previous work, we deployed IssueTAG, which uses the texts present in the
one-line summary and the description fields of the issue reports to
automatically assign them to the stakeholders, who are responsible for
resolving the reported issues. Since its deployment on January 12, 2018 at
Softtech, i.e., the software subsidiary of the largest private bank in Turkey,
IssueTAG has made a total of 301,752 assignments (as of November 2021). One
observation we make is that a large fraction of the issue reports submitted to
Softtech has screenshot attachments and, in the presence of such attachments,
the reports often convey less information in their one-line summary and the
description fields, which tends to reduce the assignment accuracy. In this
work, we use the screenshot attachments as an additional source of information
to further improve the assignment accuracy, which (to the best of our
knowledge) has not been studied before in this context. In particular, we
develop a number of multi-source (using both the issue reports and the
screenshot attachments) and single-source assignment models (using either the
issue reports or the screenshot attachments) and empirically evaluate them on
real issue reports. In the experiments, compared to the currently deployed
single-source model in the field, the best multi-source model developed in this
work significantly (in both the practical and the statistical sense) improved
the assignment accuracy for the issue reports with screenshot attachments from
0.843 to 0.858 at acceptable overhead costs, a result strongly supporting our
basic hypothesis.
Comment: Preprint for EMSE journal
Moving forward with combinatorial interaction testing
Combinatorial interaction testing (CIT) is an efficient and effective method of detecting failures that are caused by the interactions of various system input parameters. In this paper, we discuss CIT, point out some of the difficulties of applying it in practice, and highlight some recent advances that have improved CIT’s applicability to modern systems. We also provide a roadmap for future research directions, one that we hope will lead to new CIT research and to higher quality testing of industrial systems.