74,719 research outputs found
Code coverage of adaptive random testing
Random testing is a basic software testing technique that can be used to assess the software reliability as well as to detect software failures. Adaptive random testing has been proposed to enhance the failure-detection capability of random testing. Previous studies have shown that adaptive random testing can use fewer test cases than random testing to detect the first software failure. In this paper, we evaluate and compare the performance of adaptive random testing and random testing from another perspective, that of code coverage. As shown in various investigations, a higher code coverage not only brings a higher failure-detection capability, but also improves the effectiveness of software reliability estimation. We conduct a series of experiments based on two categories of code coverage criteria: structure-based coverage, and fault-based coverage. Adaptive random testing can achieve higher code coverage than random testing with the same number of test cases. Our experimental results imply that, in addition to having a better failure-detection capability than random testing, adaptive random testing also delivers a higher effectiveness in assessing software reliability, and a higher confidence in the reliability of the software under test even when no failure is detected
Code coverage of adaptive random testing
Random testing is a basic software testing technique that can be used to assess the software reliability as well as to detect software failures. Adaptive random testing has been proposed to enhance the failure-detection capability of random testing. Previous studies have shown that adaptive random testing can use fewer test cases than random testing to detect the first software failure. In this paper, we evaluate and compare the performance of adaptive random testing and random testing from another perspective, that of code coverage. As shown in various investigations, a higher code coverage not only brings a higher failure-detection capability, but also improves the effectiveness of software reliability estimation. We conduct a series of experiments based on two categories of code coverage criteria: structure-based coverage, and fault-based coverage. Adaptive random testing can achieve higher code coverage than random testing with the same number of test cases. Our experimental results imply that, in addition to having a better failure-detection capability than random testing, adaptive random testing also delivers a higher effectiveness in assessing software reliability, and a higher confidence in the reliability of the software under test even when no failure is detected
Regression test case prioritization by code combinations coverage
Regression test case prioritization (RTCP) aims to improve the rate of fault detection by executing more important test cases as early as possible. Various RTCP techniques have been proposed based on different coverage criteria. Among them, a majority of techniques leverage code coverage information to guide the prioritization process, with code units being considered individually, and in isolation. In this paper, we propose a new coverage criterion, code combinations coverage, that combines the concepts of code coverage and combination coverage. We apply this coverage criterion to RTCP, as a new prioritization technique, code combinations coverage based prioritization (CCCP). We report on empirical studies conducted to compare the testing effectiveness and efficiency of CCCP with four popular RTCP techniques: total, additional, adaptive random, and search-based test prioritization. The experimental results show that even when the lowest combination strength is assigned, overall, the CCCP fault detection rates are greater than those of the other four prioritization techniques. The CCCP prioritization costs are also found to be comparable to the additional test prioritization technique. Moreover, our results also show that when the combination strength is increased, CCCP provides higher fault detection rates than the state-of-the-art, regardless of the levels of code coverage
Test Set Diameter: Quantifying the Diversity of Sets of Test Cases
A common and natural intuition among software testers is that test cases need
to differ if a software system is to be tested properly and its quality
ensured. Consequently, much research has gone into formulating distance
measures for how test cases, their inputs and/or their outputs differ. However,
common to these proposals is that they are data type specific and/or calculate
the diversity only between pairs of test inputs, traces or outputs.
We propose a new metric to measure the diversity of sets of tests: the test
set diameter (TSDm). It extends our earlier, pairwise test diversity metrics
based on recent advances in information theory regarding the calculation of the
normalized compression distance (NCD) for multisets. An advantage is that TSDm
can be applied regardless of data type and on any test-related information, not
only the test inputs. A downside is the increased computational time compared
to competing approaches.
Our experiments on four different systems show that the test set diameter can
help select test sets with higher structural and fault coverage than random
selection even when only applied to test inputs. This can enable early test
design and selection, prior to even having a software system to test, and
complement other types of test automation and analysis. We argue that this
quantification of test set diversity creates a number of opportunities to
better understand software quality and provides practical ways to increase it.Comment: In submissio
Reinforcement Learning for Automatic Test Case Prioritization and Selection in Continuous Integration
Testing in Continuous Integration (CI) involves test case prioritization,
selection, and execution at each cycle. Selecting the most promising test cases
to detect bugs is hard if there are uncertainties on the impact of committed
code changes or, if traceability links between code and tests are not
available. This paper introduces Retecs, a new method for automatically
learning test case selection and prioritization in CI with the goal to minimize
the round-trip time between code commits and developer feedback on failed test
cases. The Retecs method uses reinforcement learning to select and prioritize
test cases according to their duration, previous last execution and failure
history. In a constantly changing environment, where new test cases are created
and obsolete test cases are deleted, the Retecs method learns to prioritize
error-prone test cases higher under guidance of a reward function and by
observing previous CI cycles. By applying Retecs on data extracted from three
industrial case studies, we show for the first time that reinforcement learning
enables fruitful automatic adaptive test case selection and prioritization in
CI and regression testing.Comment: Spieker, H., Gotlieb, A., Marijan, D., & Mossige, M. (2017).
Reinforcement Learning for Automatic Test Case Prioritization and Selection
in Continuous Integration. In Proceedings of 26th International Symposium on
Software Testing and Analysis (ISSTA'17) (pp. 12--22). AC
Target Directed Event Sequence Generation for Android Applications
Testing is a commonly used approach to ensure the quality of software, of
which model-based testing is a hot topic to test GUI programs such as Android
applications (apps). Existing approaches mainly either dynamically construct a
model that only contains the GUI information, or build a model in the view of
code that may fail to describe the changes of GUI widgets during runtime.
Besides, most of these models do not support back stack that is a particular
mechanism of Android. Therefore, this paper proposes a model LATTE that is
constructed dynamically with consideration of the view information in the
widgets as well as the back stack, to describe the transition between GUI
widgets. We also propose a label set to link the elements of the LATTE model to
program snippets. The user can define a subset of the label set as a target for
the testing requirements that need to cover some specific parts of the code. To
avoid the state explosion problem during model construction, we introduce a
definition "state similarity" to balance the model accuracy and analysis cost.
Based on this model, a target directed test generation method is presented to
generate event sequences to effectively cover the target. The experiments on
several real-world apps indicate that the generated test cases based on LATTE
can reach a high coverage, and with the model we can generate the event
sequences to cover a given target with short event sequences
- …