
    Reinforcement Learning for Test Case Prioritization

    Continuous Integration (CI) significantly reduces integration problems, speeds up development, and shortens release cycles. However, it also introduces new challenges for quality assurance activities, including regression testing, which is the focus of this work. Though various approaches to test case prioritization have been shown to be very promising in the context of regression testing, specific techniques must be designed to deal with the dynamic nature and timing constraints of CI. Recently, Reinforcement Learning (RL) has shown great potential in challenging scenarios that require continuous adaptation, such as game playing, real-time ad bidding, and recommender systems. Inspired by this line of work, and building on initial efforts to support test case prioritization with RL techniques, we perform a comprehensive investigation of RL-based test case prioritization in a CI context. To this end, treating test case prioritization as a ranking problem, we model the sequential interactions between the CI environment and a test case prioritization agent as an RL problem, using three alternative ranking models. We then rely on carefully selected and tailored state-of-the-art RL techniques to automatically and continuously learn a test case prioritization strategy whose objective is to be as close as possible to the optimal one. Our extensive experimental analysis shows that the best RL solutions provide a significant accuracy improvement over previous RL-based work, with prioritization strategies getting close to optimal, thus paving the way for using RL to prioritize test cases in a CI context.
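To make the ranking formulation concrete, the sketch below shows a toy value-based agent that learns a per-test priority score from CI cycle feedback, using an APFD-style reward (failing tests ranked earlier score higher). All names, the reward, and the update rule are illustrative assumptions for this listing, not the paper's actual ranking models or RL algorithms.

```python
def apfd_reward(ordering, failing):
    """APFD-style reward: closer to 1 when failing tests run earlier.

    Assumes each failing test reveals a distinct fault (a simplification).
    """
    if not failing:
        return 1.0
    n, m = len(ordering), len(failing)
    positions = [i + 1 for i, t in enumerate(ordering) if t in failing]
    return 1 - sum(positions) / (n * m) + 1 / (2 * n)

class PriorityAgent:
    """Toy agent: maintains one learned priority score per test case."""

    def __init__(self, alpha=0.1):
        self.scores = {}
        self.alpha = alpha

    def rank(self, tests):
        # Schedule tests with the highest learned scores first.
        return sorted(tests, key=lambda t: self.scores.get(t, 0.0), reverse=True)

    def update(self, ordering, failing):
        # Pull failing tests' scores up and passing tests' scores down,
        # scaled by how good the produced ordering was.
        reward = apfd_reward(ordering, failing)
        for t in ordering:
            target = reward if t in failing else -reward
            old = self.scores.get(t, 0.0)
            self.scores[t] = old + self.alpha * (target - old)

# Simulated CI history in which test t3 keeps failing.
agent = PriorityAgent()
tests = [f"t{i}" for i in range(5)]
for cycle in range(50):
    order = agent.rank(tests)
    agent.update(order, failing={"t3"})
```

After these cycles the agent ranks the historically failing test first, which is the adaptive behavior the abstract describes.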

    Resource Allocation Framework: Validation of Numerical Models of Complex Engineering Systems against Physical Experiments

    An increasing reliance on complex numerical simulations for high-consequence decision making is the motivation for experiment-based validation and uncertainty quantification to assess and, when needed, improve the predictive capabilities of numerical models. Uncertainties and biases in model predictions can be reduced by taking two distinct actions: (i) increasing the number of experiments in the model calibration process, and/or (ii) improving the physics sophistication of the numerical model. Therefore, decision makers must select between further code development and experimentation while allocating the finite amount of available resources. This dissertation presents a novel framework to assist in this selection between experimentation and code development for model validation strictly from the perspective of predictive capability. The reduction and convergence of discrepancy bias between model prediction and observation, computed using a suitable convergence metric, play a key role in the conceptual formulation of the framework. The proposed framework is demonstrated using two non-trivial case study applications on the Preston-Tonks-Wallace (PTW) code, which is a continuum-based plasticity approach to modeling metals, and the ViscoPlastic Self-Consistent (VPSC) code, which is a mesoscopic plasticity approach to modeling crystalline materials. Results show that the developed resource allocation framework is effective and efficient in path selection (i.e., experimentation and/or code development), resulting in a reduction of both model uncertainties and discrepancy bias. The framework developed herein goes beyond path selection in the validation of numerical models by providing a methodology for the prioritization of optimal experimental settings and an algorithm for the prioritization of code development.
If the path selection algorithm selects the experimental path, optimal selection of the settings at which these physical experiments are conducted, as well as the sequence of these experiments, is vital to maximize the gain in predictive capability of a model. Batch Sequential Design (BSD) is the methodology utilized in this work to select the optimal experimental settings. A new BSD selection criterion, Coverage Augmented Expected Improvement for Predictive Stability (C-EIPS), is developed to maximize the reduction in the model discrepancy bias and the coverage of the experiments within the domain of applicability. The new criterion, C-EIPS, is demonstrated to outperform its predecessor, the EIPS criterion, and the distance-based criterion when discrepancy bias is high and coverage is low, while exhibiting performance comparable to the distance-based criterion in efficiently maximizing the predictive capability of the VPSC model as discrepancy decreases and coverage increases. If the path selection algorithm selects the code development path, the developed framework provides an algorithm for the prioritization of code development efforts. In coupled systems, the predictive accuracy of the simulation hinges on the accuracy of the individual constituent models. The potential improvement in the predictive accuracy of the simulation that can be gained by improving a constituent model depends not only on that constituent's relative importance but also on its inherent uncertainty and inaccuracy. As such, a unique and quantitative code prioritization index (CPI) is proposed to accomplish the task of prioritizing code development efforts, and its application is demonstrated on a case study of a steel frame with semi-rigid connections.
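For intuition about the baseline that C-EIPS is compared against, the sketch below implements a generic greedy maximin (distance-based) batch selection: each new experimental setting maximizes its minimum Euclidean distance to everything already tested or already in the batch. This reflects only the generic distance-based criterion mentioned in the abstract; the C-EIPS criterion itself involves discrepancy-bias and expected-improvement terms that are not reproduced here, and all names below are illustrative.

```python
import math

def maximin_batch(candidates, tested, batch_size):
    """Greedy distance-based (maximin) batch selection for sequential
    experiment design. Settings are tuples of numeric coordinates;
    `tested` must be non-empty."""
    chosen = list(tested)
    pool = [c for c in candidates if c not in chosen]
    batch = []
    for _ in range(batch_size):
        # Pick the candidate farthest from all previously chosen settings.
        pick = max(pool, key=lambda c: min(math.dist(c, x) for x in chosen))
        batch.append(pick)
        chosen.append(pick)
        pool.remove(pick)
    return batch

# One experiment already run at the origin; four candidate settings.
tested = [(0.0, 0.0)]
grid = [(0.0, 1.0), (1.0, 0.0), (1.0, 1.0), (0.5, 0.5)]
batch = maximin_batch(grid, tested, batch_size=2)
```

The first pick is the corner (1.0, 1.0), the setting farthest from the tested point, which is exactly the space-filling behavior a distance-based criterion rewards.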
Findings show that the CPI is effective in identifying the most critical constituent of the coupled system, whose improvement leads to the highest overall enhancement of the predictive capability of the coupled model.
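The idea behind a code prioritization index can be sketched as ranking constituents by importance weighted by uncertainty, so that an influential but inaccurate constituent tops the list. The inputs and the simple product form below are illustrative assumptions, not the dissertation's actual CPI definition.

```python
def code_prioritization_index(constituents):
    """Rank constituents of a coupled model for code development effort.

    Each entry is (name, importance, uncertainty), where importance is the
    constituent's share of influence on the coupled output (0..1) and
    uncertainty is a normalized measure of its predictive inaccuracy (0..1).
    Hypothetical index: score = importance * uncertainty.
    """
    scored = [(name, imp * unc) for name, imp, unc in constituents]
    return sorted(scored, key=lambda s: s[1], reverse=True)

# Toy coupled model of a steel frame (values invented for illustration).
frame = [
    ("connection_model", 0.6, 0.5),  # influential and uncertain -> top priority
    ("beam_model",       0.3, 0.2),
    ("column_model",     0.1, 0.8),  # uncertain, but has little influence
]
ranking = code_prioritization_index(frame)
```

Note how the uncertain-but-unimportant column model is ranked below the connection model: the product form captures the abstract's point that improvement potential depends on both importance and uncertainty, not either alone.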

    Model based test suite minimization using metaheuristics

    Software testing is one of the most widely used methods for quality assurance and fault detection. However, it is also one of the most expensive, tedious, and time-consuming activities in the software development life cycle. Code-based and specification-based testing have been practiced for almost four decades. Model-based testing (MBT) is a relatively new approach to software testing in which software models, as opposed to other artifacts (i.e., source code), are used as the primary source of test cases. Models are simplified representations of a software system and are cheaper to execute than the original or deployed system. The main objective of the research presented in this thesis is the development of a framework for improving the efficiency and effectiveness of test suites generated from UML models. It focuses on three activities: transformation of an Activity Diagram (AD) model into a Colored Petri Net (CPN) model, generation and evaluation of an AD-based test suite, and optimization of an AD-based test suite. The Unified Modeling Language (UML) is a de facto standard for software system analysis and design. UML models can be categorized into structural and behavioral models. The AD is a behavioral type of UML model, and since the major revision in UML version 2.x it has new Petri-net-like semantics. It has a wide application scope, including embedded, workflow, and web-service systems; for this reason, this thesis concentrates on AD models. The informal semantics of UML in general, and of the AD in particular, is a major challenge in the development of UML-based verification and validation tools. One solution to this challenge is transforming a UML model into an executable formal model. In the thesis, a three-step transformation methodology is proposed for resolving ambiguities in an AD model and then transforming it into a CPN representation, a well-known formal language with extensive tool support.
Test case generation is one of the most critical and labor-intensive activities in testing processes. The flow-oriented semantics of the AD suit modeling both sequential and concurrent systems. The thesis presents a novel technique to generate test cases from an AD using a stochastic algorithm. To determine whether the generated test suite is adequate, two test suite adequacy analysis techniques, based on structural coverage and mutation, are proposed. In terms of structural coverage, two separate coverage criteria are proposed to evaluate the adequacy of the test suite from both perspectives, sequential and concurrent. Mutation analysis is a fault-based technique to determine whether the test suite is adequate for detecting particular types of faults; four categories of mutation operators are defined to seed specific faults into the mutant model. Another focus of the thesis is improving test suite efficiency without compromising effectiveness. One way of achieving this is identifying and removing redundant test cases. It has been shown that test suite minimization by removing redundant test cases is a combinatorial optimization problem. An evolutionary computation based test suite minimization technique is developed to address this problem, and its performance is empirically compared with other well-known heuristic algorithms. Additionally, statistical analysis is performed to characterize the fitness landscape of test suite minimization problems. The proposed test suite minimization solution is extended to multi-objective minimization. As redundancy is contextual, different criteria and their combinations can significantly change the solution test suite; therefore, the last part of the thesis describes an investigation into multi-objective test suite minimization and optimization algorithms. The proposed framework is demonstrated and evaluated using prototype tools and case study models.
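The combinatorial view of minimization described above can be sketched as a small genetic algorithm: a candidate is a subset of tests encoded as a bit vector, and fitness rewards full coverage of the test requirements first and a smaller suite second. This is a generic GA illustration under invented names and data, not the thesis's actual operators, fitness function, or benchmark models.

```python
import random

def minimize_suite(requirements_covered, n_generations=200, pop_size=30, seed=0):
    """Evolutionary test suite minimization sketch.

    requirements_covered: dict mapping test name -> set of requirements
    (e.g. coverage targets) that the test satisfies.
    """
    rng = random.Random(seed)
    tests = sorted(requirements_covered)

    def fitness(bits):
        chosen = [t for t, b in zip(tests, bits) if b]
        covered = set().union(*(requirements_covered[t] for t in chosen)) if chosen else set()
        # Lexicographic: full coverage dominates; among equals, smaller wins.
        return (len(covered), -len(chosen))

    pop = [[rng.randint(0, 1) for _ in tests] for _ in range(pop_size)]
    for _ in range(n_generations):
        pop.sort(key=fitness, reverse=True)
        survivors = pop[: pop_size // 2]          # elitist truncation selection
        children = []
        while len(survivors) + len(children) < pop_size:
            a, b = rng.sample(survivors, 2)
            cut = rng.randrange(1, len(tests))    # one-point crossover
            child = a[:cut] + b[cut:]
            child[rng.randrange(len(tests))] ^= 1 # point mutation
            children.append(child)
        pop = survivors + children
    best = max(pop, key=fitness)
    return [t for t, b in zip(tests, best) if b]

# Toy suite: t2 is redundant because t1 already covers r2.
suite = {
    "t1": {"r1", "r2"},
    "t2": {"r2"},
    "t3": {"r3"},
    "t4": {"r1", "r3"},
}
minimal = minimize_suite(suite)
covered = set().union(*(suite[t] for t in minimal))
```

Because selection is elitist, the best covering subset found is never lost, so the returned suite still covers every requirement while redundant tests tend to be dropped.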
Empirical results have shown that the techniques developed within the framework are effective in model-based test suite generation and optimization.