Reinforcement Learning for Test Case Prioritization
Continuous Integration (CI) significantly reduces integration problems,
speeds up development time, and shortens release time. However, it also
introduces new challenges for quality assurance activities, including
regression testing, which is the focus of this work. Though various approaches
for test case prioritization have been shown to be very promising in the context of
regression testing, specific techniques must be designed to deal with the
dynamic nature and timing constraints of CI.
Recently, Reinforcement Learning (RL) has shown great potential in various
challenging scenarios that require continuous adaptation, such as game playing,
real-time ads bidding, and recommender systems. Inspired by this line of work
and building on initial efforts in supporting test case prioritization with RL
techniques, we perform here a comprehensive investigation of RL-based test case
prioritization in a CI context. To this end, taking test case prioritization as
a ranking problem, we model the sequential interactions between the CI
environment and a test case prioritization agent as an RL problem, using three
alternative ranking models. We then rely on carefully selected and tailored
state-of-the-art RL techniques to automatically and continuously learn a test
case prioritization strategy, whose objective is to be as close as possible to
the optimal one. Our extensive experimental analysis shows that the best RL
solutions provide a significant accuracy improvement over previous RL-based
work, with prioritization strategies getting close to being optimal, thus
paving the way for using RL to prioritize test cases in a CI context.
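The abstract frames prioritization as a ranking problem learned through interaction with the CI environment. The sketch below is a minimal illustration of that framing, not the paper's implementation: a linear agent scores test cases from simple features, observes which tests fail after execution, and nudges its weights so failing tests rank higher in later cycles. The feature set and update rule are assumptions.

```python
# Minimal sketch of RL-based test case prioritization (illustrative only).
import numpy as np

class RankingAgent:
    def __init__(self, n_features, lr=0.1):
        self.w = np.zeros(n_features)
        self.lr = lr

    def prioritize(self, features):
        # features: (n_tests, n_features); higher score = run earlier.
        return np.argsort(-(features @ self.w))

    def update(self, features, failed):
        # Reward +1 for tests that failed (they deserved a high rank),
        # -1 for tests that passed; move weights toward the reward.
        for x, did_fail in zip(features, failed):
            self.w += self.lr * (1.0 if did_fail else -1.0) * x

agent = RankingAgent(n_features=3)
feats = np.array([[0.2, 1.0, 0.0],   # columns: duration, recent failures, age
                  [0.9, 0.0, 1.0],
                  [0.4, 2.0, 0.3]])
order = agent.prioritize(feats)       # rank, then execute in this order
agent.update(feats, failed=[True, False, True])
```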
A dissimilarity with dice-jaro-winkler test case prioritization approach for model-based testing in software product line
The effectiveness of testing in Model-based Testing (MBT) for Software Product Line (SPL) can be improved by considering fault detection in test cases. Without considering faults, test cases in a test suite end up ordered essentially at random. Test Case Prioritization (TCP) is a regression testing technique that detects faults as early as possible by reordering test cases based on their fault detection rate. However, there is a lack of studies that measure faults in MBT for SPL. This paper proposes a TCP approach based on dissimilarity and string-based distance, called Last Minimal for Local Maximal Distance (LM-LMD) with Dice-Jaro-Winkler Dissimilarity. It adopts Local Maximum Distance as the prioritization algorithm and the Dice-Jaro-Winkler similarity measure to evaluate the distance among test cases. This work is based on test cases generated from statecharts in the SPL domain. Our results are promising: LM-LMD with Dice-Jaro-Winkler Dissimilarity outperformed the original Local Maximum Distance, Global Maximum Distance, and Enhanced All-yes Configuration algorithms in terms of Average Percentage of Faults Detected (APFD) and average prioritization time.
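To make the distance computation concrete, here is a hedged sketch of dissimilarity-based ordering in the spirit of the approach: a Dice coefficient over character bigrams, a from-scratch Jaro-Winkler similarity, and a greedy loop that repeatedly picks the test case farthest from those already scheduled. The unweighted blend of the two measures and the greedy rule are assumptions, not the paper's exact definitions.

```python
# Hedged sketch: dissimilarity-driven test case prioritization.

def dice(s1, s2):
    # Dice coefficient over character bigrams.
    b1 = {s1[i:i + 2] for i in range(len(s1) - 1)}
    b2 = {s2[i:i + 2] for i in range(len(s2) - 1)}
    if not b1 and not b2:
        return 1.0
    return 2 * len(b1 & b2) / (len(b1) + len(b2))

def jaro_winkler(s1, s2, p=0.1):
    if s1 == s2:
        return 1.0
    n1, n2 = len(s1), len(s2)
    if not n1 or not n2:
        return 0.0
    window = max(max(n1, n2) // 2 - 1, 0)
    m1, m2 = [False] * n1, [False] * n2
    matches = 0
    for i in range(n1):  # count matches within the Jaro window
        for j in range(max(0, i - window), min(n2, i + window + 1)):
            if not m2[j] and s1[i] == s2[j]:
                m1[i] = m2[j] = True
                matches += 1
                break
    if not matches:
        return 0.0
    t, k = 0, 0
    for i in range(n1):  # count transpositions among matched characters
        if m1[i]:
            while not m2[k]:
                k += 1
            if s1[i] != s2[k]:
                t += 1
            k += 1
    jaro = (matches / n1 + matches / n2 + (matches - t / 2) / matches) / 3
    prefix = 0
    for a, b in zip(s1[:4], s2[:4]):  # Winkler common-prefix bonus
        if a != b:
            break
        prefix += 1
    return jaro + prefix * p * (1 - jaro)

def dissimilarity(s1, s2):
    # Assumed combination: unweighted average of the two similarities.
    return 1 - (dice(s1, s2) + jaro_winkler(s1, s2)) / 2

def prioritize(test_cases):
    # Greedy max-distance ordering: always pick the test case farthest
    # from everything already scheduled.
    remaining = list(test_cases)
    order = [remaining.pop(0)]
    while remaining:
        best = max(remaining,
                   key=lambda t: min(dissimilarity(t, s) for s in order))
        remaining.remove(best)
        order.append(best)
    return order
```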
Regression testing framework for test case generation and prioritization
A regression test is a significant part of software testing. It is used to find the maximum number of faults in software applications. Test Case Prioritization (TCP) is an approach to prioritize and schedule test cases so that faults are detected at an earlier stage of testing. Code coverage is one feature of a Regression Test (RT) that helps detect more faults in a software application. However, coverage-based fault detection reduces the performance of existing test case prioritization techniques, because scanning an entire code base consumes a lot of time. The process of generating test cases plays an important role in the prioritization of test cases. Existing automated generation and prioritization techniques produce insufficient test cases, which causes a low fault detection rate, or consume too much computation time to detect more faults. Unified Modelling Language (UML) based test case generation techniques can extract test cases from UML diagrams while covering the maximum part of a module of an application. Therefore, UML-based test case generation can support a test case prioritization technique in finding a greater number of faults with shorter execution time. A multi-objective optimization technique can handle multiple objectives at once, supporting RT by generating more test cases, increasing the fault detection rate, and producing better results. The aim of this research is to develop a framework that detects the maximum number of faults in less execution time, thereby improving RT. The performance of RT can be improved by an efficient test case generation and prioritization method based on a multi-objective optimization technique that handles both test cases and the rate of fault detection. This framework consists of two important models: Test Case Generation (TCG) and TCP. The TCG model requires a UML use case diagram to extract test cases; a metaheuristic approach that uses tokens is employed for generating them. The TCP model then receives the extracted test cases, with faults, as input and produces a prioritized set of test cases. The proposed research modifies the existing Hill Climbing-based TCP by altering its test case swapping feature so that faults are detected in reasonable execution time. The proposed framework intends to improve the performance of regression testing by generating and prioritizing test cases in order to find a greater number of faults in an application. Two case studies were conducted to gather Test Cases (TC) and faults for multiple modules. The proposed framework yielded 92.2% Average Percentage of Faults Detected with less testing time compared to other artificial intelligence-based TCP techniques. The findings show that the proposed framework produces a sufficient number of test cases and finds the maximum number of faults in less time.
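The modified Hill Climbing prioritization described above can be pictured with a small sketch: random swap moves over the test order, kept only when they improve APFD. This is an illustrative reconstruction under assumed data structures, not the framework's actual code.

```python
# Hedged sketch: swap-based hill climbing toward a higher-APFD ordering.
import random

def apfd(order, detects, n_faults):
    # detects[t] is the set of faults test case t exposes.
    first = {}
    for pos, t in enumerate(order, start=1):
        for f in detects[t]:
            first.setdefault(f, pos)
    if not first:
        return 0.0
    n = len(order)
    return 1 - sum(first.values()) / (n * n_faults) + 1 / (2 * n)

def hill_climb(order, detects, n_faults, iters=1000):
    # Try a random swap; keep it only if APFD improves.
    order = order[:]
    best = apfd(order, detects, n_faults)
    for _ in range(iters):
        i, j = random.sample(range(len(order)), 2)
        order[i], order[j] = order[j], order[i]
        score = apfd(order, detects, n_faults)
        if score > best:
            best = score
        else:
            order[i], order[j] = order[j], order[i]  # revert the swap
    return order, best

detects = {"t1": {"f1"}, "t2": {"f2", "f3"}, "t3": set()}
print(hill_climb(["t3", "t1", "t2"], detects, n_faults=3))
```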
A Bayesian Framework for Software Regression Testing
Software maintenance reportedly accounts for much of the total cost associated
with developing software. These costs occur because modifying software is a highly
error-prone task. Changing software to correct faults or add new functionality
can cause existing functionality to regress, introducing new faults. To avoid such
defects, one can re-test software after modifications, a task commonly known as
regression testing.
Regression testing typically involves the re-execution of test cases developed for
previous versions. Re-running all existing test cases, however, is often costly and
sometimes even infeasible due to time and resource constraints. Re-running test
cases that do not exercise changed or change-impacted parts of the program carries
extra cost and gives no benefit. The research community has thus sought ways to
optimize regression testing by lowering the cost of test re-execution while preserving
its effectiveness. To this end, researchers have proposed selecting a subset of test
cases according to a variety of criteria (test case selection) and reordering test cases
for execution to maximize a score function (test case prioritization).
This dissertation presents a novel framework for optimizing regression testing
activities, based on a probabilistic view of regression testing. The proposed framework
is built around predicting the probability that each test case finds faults in the
regression testing phase, and optimizing the test suites accordingly. To predict such
probabilities, we model regression testing using a Bayesian Network (BN), a powerful
probabilistic tool for modeling uncertainty in systems. We build this model using
information measured directly from the software system. Our proposed framework
builds upon the existing research in this area in many ways. First, our framework
incorporates different information extracted from software into one model, which
helps reduce uncertainty by using more of the available information, and enables
better modeling of the system. Moreover, our framework provides flexibility by
enabling a choice of which sources of information to use. Research in software
measurement has proven that dealing with different systems requires different techniques
and hence requires such flexibility. Using the proposed framework, engineers
can customize their regression testing techniques to fit the characteristics of their
systems using measurements most appropriate to their environment.
We evaluate the performance of our proposed BN-based framework empirically.
Although the framework can help both test case selection and prioritization, we
propose using it primarily as a prioritization technique. We therefore compare our
technique against other prioritization techniques from the literature. Our empirical
evaluation examines a variety of objects and fault types. The results show that the
proposed framework can outperform other techniques in some cases and performs
comparably in the others.
In sum, this thesis introduces a novel Bayesian framework for optimizing regression
testing and shows that the proposed framework can help testers improve the
cost effectiveness of their regression testing tasks.
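As a toy illustration of the probabilistic view (not the dissertation's BN model), one can combine independent evidence sources per test case into a posterior probability of revealing a fault and order tests by it. The priors and likelihood ratios below are invented numbers for illustration.

```python
# Toy Bayesian combination of evidence for fault-revealing probability.
def fault_probability(prior, likelihood_ratios):
    # likelihood_ratios: P(evidence | fault) / P(evidence | no fault),
    # one per information source, combined on the odds scale.
    odds = prior / (1 - prior)
    for lr in likelihood_ratios:
        odds *= lr
    return odds / (1 + odds)

tests = {
    "t1": fault_probability(0.05, [4.0, 2.5]),  # covers changed code, failed recently
    "t2": fault_probability(0.05, [0.5, 1.0]),  # touches unchanged code only
}
priority = sorted(tests, key=tests.get, reverse=True)  # -> ['t1', 't2']
```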
A regression test case selection and prioritization for object-oriented programs using dependency graph and genetic algorithm
Regression testing is a very important activity in software testing. Re-executing all test cases during regression testing is costly, so effective and efficient test case selection from the existing test suite becomes a critical issue. This paper presents an evolutionary regression test case prioritization for object-oriented software based on dependence-graph analysis of the affected program using a Genetic Algorithm (GA). The approach optimizes the test cases selected from a test suite T. The goal is to identify changes in a method's body due to data dependence, control dependence, and object relations such as inheritance and polymorphism, select test cases based on the affected statements, and order them by fitness using the GA. The number of affected statements a test case covers determines how fit it is for regression testing. A case study is reported to provide evidence of the feasibility of the approach and its benefits in increasing the rate of fault detection and reducing regression testing effort. The goodness of an ordering is measured using the Average Percentage of Faults Detected (APFD) metric to evaluate the effectiveness and efficiency of the approach. We observed that the proposed approach is more efficient and effective in regression testing.
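A minimal sketch of the evolutionary ordering step, under assumed inputs: a population of test-order permutations evolved with swap mutation and truncation selection toward higher APFD. The paper's actual GA operators and its fitness based on affected statements are richer; this only shows the shape of the loop.

```python
# Hedged sketch: GA over test-order permutations, maximizing APFD.
import random

def apfd(order, detects, n_faults):
    first = {}
    for pos, t in enumerate(order, start=1):
        for f in detects[t]:
            first.setdefault(f, pos)
    if not first:
        return 0.0
    n = len(order)
    return 1 - sum(first.values()) / (n * n_faults) + 1 / (2 * n)

def evolve(tests, detects, n_faults, pop_size=20, gens=50):
    pop = [random.sample(tests, len(tests)) for _ in range(pop_size)]
    for _ in range(gens):
        pop.sort(key=lambda o: apfd(o, detects, n_faults), reverse=True)
        survivors = pop[: pop_size // 2]   # truncation selection
        children = []
        for parent in survivors:
            child = parent[:]
            i, j = random.sample(range(len(child)), 2)
            child[i], child[j] = child[j], child[i]  # swap mutation
            children.append(child)
        pop = survivors + children
    return max(pop, key=lambda o: apfd(o, detects, n_faults))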
Reinforcement Learning for Automatic Test Case Prioritization and Selection in Continuous Integration
Testing in Continuous Integration (CI) involves test case prioritization,
selection, and execution at each cycle. Selecting the most promising test cases
to detect bugs is hard if there are uncertainties on the impact of committed
code changes or if traceability links between code and tests are not
available. This paper introduces Retecs, a new method for automatically
learning test case selection and prioritization in CI with the goal of minimizing
the round-trip time between code commits and developer feedback on failed test
cases. The Retecs method uses reinforcement learning to select and prioritize
test cases according to their duration, last execution, and failure
history. In a constantly changing environment, where new test cases are created
and obsolete test cases are deleted, the Retecs method learns to prioritize
error-prone test cases higher under guidance of a reward function and by
observing previous CI cycles. By applying Retecs on data extracted from three
industrial case studies, we show for the first time that reinforcement learning
enables fruitful automatic adaptive test case selection and prioritization in
CI and regression testing.
Comment: Spieker, H., Gotlieb, A., Marijan, D., & Mossige, M. (2017). Reinforcement Learning for Automatic Test Case Prioritization and Selection in Continuous Integration. In Proceedings of the 26th International Symposium on Software Testing and Analysis (ISSTA'17), pp. 12-22. ACM.
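The constantly changing test suite described above can be sketched as follows (an illustration under assumed names, not the Retecs implementation): a rolling failure history per test case drives the ranking, new test cases enter with a neutral prior, and deleted test cases are forgotten.

```python
# Hedged sketch of history-driven prioritization in a changing test suite.
class TestMemory:
    def __init__(self, history_len=4):
        self.history = {}          # test name -> recent pass/fail record
        self.history_len = history_len

    def observe_cycle(self, results):
        # results: {test_name: failed?} for the current CI cycle.
        for name, failed in results.items():
            rec = self.history.setdefault(name, [])
            rec.append(1 if failed else 0)
            del rec[:-self.history_len]        # keep a rolling window
        for gone in set(self.history) - set(results):
            del self.history[gone]             # obsolete tests are dropped

    def prioritize(self, names, durations):
        # Error-prone (recently failing) tests first; fast tests break ties.
        def score(n):
            rec = self.history.get(n, [])
            fail_rate = sum(rec) / len(rec) if rec else 0.5  # neutral prior
            return (fail_rate, -durations[n])
        return sorted(names, key=score, reverse=True)

mem = TestMemory()
mem.observe_cycle({"t1": True, "t2": False})
print(mem.prioritize(["t1", "t2", "t_new"], {"t1": 30, "t2": 5, "t_new": 10}))
```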
Test case prioritization
Regression testing is an expensive software engineering activity intended to provide confidence that modifications to a software system have not introduced faults. Test case prioritization techniques help to reduce regression testing cost by ordering test cases in a way that better achieves testing objectives. In this thesis, we are interested in prioritizing to maximize a test suite's rate of fault detection, measured by a metric, APFD, trying to detect regression faults as early as possible during testing. In previous work, several prioritization techniques using low-level code coverage information had been developed. These techniques try to maximize APFD over a sequence of software releases, not targeting a particular release. These techniques' effectiveness was empirically evaluated. We present a larger set of prioritization techniques that use information at arbitrary granularity levels and incorporate modification information, targeting prioritization at a particular software release. Our empirical studies show significant improvements in the rate of fault detection over randomly ordered test suites. Previous work on prioritization assumed uniform test costs and fault severities, which might not be realistic in many practical cases. We present a new cost-cognizant metric, APFD_c, and prioritization techniques, together with approaches for measuring and estimating these costs. Our empirical studies evaluate prioritization in a cost-cognizant environment. Prioritization techniques have been developed independently with little consideration of their similarities. We present a general prioritization framework that allows us to express existing prioritization techniques through a framework algorithm using parameters and specific functions. Previous research assumed that prioritization was always beneficial if it improved the APFD metric. We introduce a prioritization cost-benefit model that more accurately captures relevant cost and benefit factors, and allows practitioners to assess whether it is economical to employ prioritization. Prioritization effectiveness varies across programs, versions, and test suites. We empirically investigate several of these factors on substantial software systems and present a classification-tree-based predictor that can help select the most appropriate prioritization technique in advance. Together, these results improve our understanding of test case prioritization and of the processes by which it is performed.
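Both metrics mentioned above have standard closed forms in the prioritization literature; the sketch below translates them directly, with APFD_c weighting each fault by severity and each test by cost. Variable names and data structures are mine, not the thesis's.

```python
# APFD and cost-cognizant APFD_c for a prioritized test order.
def apfd(order, detects, n_faults):
    # APFD = 1 - (TF_1 + ... + TF_m) / (n * m) + 1 / (2n), where TF_i is
    # the 1-based position of the first test exposing fault i.
    first = {}
    for pos, t in enumerate(order, start=1):
        for f in detects[t]:
            first.setdefault(f, pos)
    n = len(order)
    return 1 - sum(first.values()) / (n * n_faults) + 1 / (2 * n)

def apfd_c(order, detects, cost, severity):
    # Each fault contributes its severity times the cost of the tests run
    # from its first detection onward, counting only half the detecting
    # test's own cost; normalized by total cost times total severity.
    first = {}
    for pos, t in enumerate(order):
        for f in detects[t]:
            first.setdefault(f, pos)
    total = 0.0
    for f, pos in first.items():
        tail = sum(cost[t] for t in order[pos:])
        total += severity[f] * (tail - 0.5 * cost[order[pos]])
    return total / (sum(cost[t] for t in order) * sum(severity.values()))
```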
Input Prioritization for Testing Neural Networks
Deep neural networks (DNNs) are increasingly being adopted for sensing and
control functions in a variety of safety and mission-critical systems such as
self-driving cars, autonomous air vehicles, medical diagnostics, and industrial
robotics. Failures of such systems can lead to loss of life or property, which
necessitates stringent verification and validation for providing high
assurance. Though formal verification approaches are being investigated,
testing remains the primary technique for assessing the dependability of such
systems. Due to the nature of the tasks handled by DNNs, the cost of obtaining
test oracle data (the expected output, a.k.a. label, for a given input) is
high, which significantly impacts the amount and quality of testing that can be
performed. Thus, prioritizing input data for testing DNNs in meaningful ways to
reduce the cost of labeling can go a long way in increasing testing efficacy.
This paper proposes using gauges of the DNN's sentiment derived from the
computation performed by the model, as a means to identify inputs that are
likely to reveal weaknesses. We empirically assessed the efficacy of three such
sentiment measures for prioritization (confidence, uncertainty, and
surprise) and compared their effectiveness in terms of their fault-revealing
capability and retraining effectiveness. The results indicate that sentiment
measures can effectively flag inputs that expose unacceptable DNN behavior. For
MNIST models, the average percentage of inputs correctly flagged ranged from
88% to 94.8%.
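A small sketch of the prioritization idea, assuming a classifier that outputs logits: inputs the model is least sure about are labeled first. Only two of the named measures are illustrated (softmax confidence and predictive entropy as an uncertainty proxy); surprise adequacy is omitted.

```python
# Hedged sketch: sentiment-based input prioritization for a DNN.
import numpy as np

def softmax(logits):
    z = logits - logits.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def by_confidence(logits):
    # Least confident inputs first: lowest top-class probability is
    # assumed most likely to expose misbehavior.
    return np.argsort(softmax(logits).max(axis=1))

def by_uncertainty(logits):
    # Predictive entropy; most uncertain inputs first.
    p = softmax(logits)
    entropy = -(p * np.log(p + 1e-12)).sum(axis=1)
    return np.argsort(-entropy)

logits = np.array([[4.0, 0.1, 0.2],   # confident prediction
                   [1.0, 0.9, 1.1]])  # borderline prediction
print(by_confidence(logits))          # -> [1 0]
```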