17,164 research outputs found

    An Empirical Comparison of Combinatorial Testing, Random Testing and Adaptive Random Testing

    Get PDF
    We present an empirical comparison of three test generation techniques, namely, Combinatorial Testing (CT), Random Testing (RT) and Adaptive Random Testing (ART), under different test scenarios. This is the first study in the literature to account for the (more realistic) testing setting in which the tester may not have complete information about the parameters and constraints that pertain to the system, and to account for the challenge posed by faults (in terms of failure rate). Our study was conducted on nine real-world programs under a total of 1683 test scenarios (combinations of available parameter and constraint information and failure rate). The results show significant differences in the techniques' fault detection ability when faults are hard to detect (failure rates are relatively low). CT performs best overall; no worse than any other in 98% of scenarios studied. ART enhances RT, and is comparable to CT in 96% of scenarios, but its computational cost can be up to 3.5 times higher than CT when the program is highly constrained. Additionally, when constraint information is unavailable for a highly-constrained program, a large random test suite is as effective as CT or ART, yet its computational cost of test generation is significantly lower than that of other techniques

    Fuzzy Adaptive Tuning of a Particle Swarm Optimization Algorithm for Variable-Strength Combinatorial Test Suite Generation

    Full text link
    Combinatorial interaction testing is an important software testing technique that has seen lots of recent interest. It can reduce the number of test cases needed by considering interactions between combinations of input parameters. Empirical evidence shows that it effectively detects faults, in particular, for highly configurable software systems. In real-world software testing, the input variables may vary in how strongly they interact, variable strength combinatorial interaction testing (VS-CIT) can exploit this for higher effectiveness. The generation of variable strength test suites is a non-deterministic polynomial-time (NP) hard computational problem \cite{BestounKamalFuzzy2017}. Research has shown that stochastic population-based algorithms such as particle swarm optimization (PSO) can be efficient compared to alternatives for VS-CIT problems. Nevertheless, they require detailed control for the exploitation and exploration trade-off to avoid premature convergence (i.e. being trapped in local optima) as well as to enhance the solution diversity. Here, we present a new variant of PSO based on Mamdani fuzzy inference system \cite{Camastra2015,TSAKIRIDIS2017257,KHOSRAVANIAN2016280}, to permit adaptive selection of its global and local search operations. We detail the design of this combined algorithm and evaluate it through experiments on multiple synthetic and benchmark problems. We conclude that fuzzy adaptive selection of global and local search operations is, at least, feasible as it performs only second-best to a discrete variant of PSO, called DPSO. Concerning obtaining the best mean test suite size, the fuzzy adaptation even outperforms DPSO occasionally. We discuss the reasons behind this performance and outline relevant areas of future work.Comment: 21 page

    Computationally Tractable Algorithms for Finding a Subset of Non-defective Items from a Large Population

    Full text link
    In the classical non-adaptive group testing setup, pools of items are tested together, and the main goal of a recovery algorithm is to identify the "complete defective set" given the outcomes of different group tests. In contrast, the main goal of a "non-defective subset recovery" algorithm is to identify a "subset" of non-defective items given the test outcomes. In this paper, we present a suite of computationally efficient and analytically tractable non-defective subset recovery algorithms. By analyzing the probability of error of the algorithms, we obtain bounds on the number of tests required for non-defective subset recovery with arbitrarily small probability of error. Our analysis accounts for the impact of both the additive noise (false positives) and dilution noise (false negatives). By comparing with the information theoretic lower bounds, we show that the upper bounds on the number of tests are order-wise tight up to a log2K\log^2K factor, where KK is the number of defective items. We also provide simulation results that compare the relative performance of the different algorithms and provide further insights into their practical utility. The proposed algorithms significantly outperform the straightforward approaches of testing items one-by-one, and of first identifying the defective set and then choosing the non-defective items from the complement set, in terms of the number of measurements required to ensure a given success rate.Comment: In this revision: Unified some proofs and reorganized the paper, corrected a small mistake in one of the proofs, added more reference

    Efficient Benchmarking of Algorithm Configuration Procedures via Model-Based Surrogates

    Get PDF
    The optimization of algorithm (hyper-)parameters is crucial for achieving peak performance across a wide range of domains, ranging from deep neural networks to solvers for hard combinatorial problems. The resulting algorithm configuration (AC) problem has attracted much attention from the machine learning community. However, the proper evaluation of new AC procedures is hindered by two key hurdles. First, AC benchmarks are hard to set up. Second and even more significantly, they are computationally expensive: a single run of an AC procedure involves many costly runs of the target algorithm whose performance is to be optimized in a given AC benchmark scenario. One common workaround is to optimize cheap-to-evaluate artificial benchmark functions (e.g., Branin) instead of actual algorithms; however, these have different properties than realistic AC problems. Here, we propose an alternative benchmarking approach that is similarly cheap to evaluate but much closer to the original AC problem: replacing expensive benchmarks by surrogate benchmarks constructed from AC benchmarks. These surrogate benchmarks approximate the response surface corresponding to true target algorithm performance using a regression model, and the original and surrogate benchmark share the same (hyper-)parameter space. In our experiments, we construct and evaluate surrogate benchmarks for hyperparameter optimization as well as for AC problems that involve performance optimization of solvers for hard combinatorial problems, drawing training data from the runs of existing AC procedures. We show that our surrogate benchmarks capture overall important characteristics of the AC scenarios, such as high- and low-performing regions, from which they were derived, while being much easier to use and orders of magnitude cheaper to evaluate

    A controlled migration genetic algorithm operator for hardware-in-the-loop experimentation

    Get PDF
    In this paper, we describe the development of an extended migration operator, which combats the negative effects of noise on the effective search capabilities of genetic algorithms. The research is motivated by the need to minimize the num- ber of evaluations during hardware-in-the-loop experimentation, which can carry a significant cost penalty in terms of time or financial expense. The authors build on previous research, where convergence for search methods such as Simulated Annealing and Variable Neighbourhood search was accelerated by the implementation of an adaptive decision support operator. This methodology was found to be effective in searching noisy data surfaces. Providing that noise is not too significant, Genetic Al- gorithms can prove even more effective guiding experimentation. It will be shown that with the introduction of a Controlled Migration operator into the GA heuristic, data, which repre- sents a significant signal-to-noise ratio, can be searched with significant beneficial effects on the efficiency of hardware-in-the- loop experimentation, without a priori parameter tuning. The method is tested on an engine-in-the-loop experimental example, and shown to bring significant performance benefits
    corecore