4,407 research outputs found
Learning How to Search: Generating Exception-Triggering Tests Through Adaptive Fitness Function Selection
Search-based test generation is guided by feedback from one or more fitness functions—scoring functions that judge solution optimality. Choosing informative fitness functions is crucial to meeting the goals of a tester. Unfortunately, many goals—such as forcing the class-under-test to throw exceptions— do not have a known fitness function formulation. We propose that meeting such goals requires treating fitness function identification as a secondary optimization step. An adaptive algorithm that can vary the selection of fitness functions could adjust its selection throughout the generation process to maximize goal attainment, based on the current population of test suites. To test this hypothesis, we have implemented two reinforcement learning algorithms in the EvoSuite framework, and used these algorithms to dynamically set the fitness functions used during generation.We have evaluated our framework, EvoSuiteFIT, on a set of 386 real faults. EvoSuiteFIT discovers and retains more exception-triggering input and produces suites that detect a variety of faults missed by the other techniques. The ability to adjust fitness functions allows EvoSuiteFIT to make strategic choices that efficiently produce more effective test suites
Test Set Diameter: Quantifying the Diversity of Sets of Test Cases
A common and natural intuition among software testers is that test cases need
to differ if a software system is to be tested properly and its quality
ensured. Consequently, much research has gone into formulating distance
measures for how test cases, their inputs and/or their outputs differ. However,
common to these proposals is that they are data type specific and/or calculate
the diversity only between pairs of test inputs, traces or outputs.
We propose a new metric to measure the diversity of sets of tests: the test
set diameter (TSDm). It extends our earlier, pairwise test diversity metrics
based on recent advances in information theory regarding the calculation of the
normalized compression distance (NCD) for multisets. An advantage is that TSDm
can be applied regardless of data type and on any test-related information, not
only the test inputs. A downside is the increased computational time compared
to competing approaches.
Our experiments on four different systems show that the test set diameter can
help select test sets with higher structural and fault coverage than random
selection even when only applied to test inputs. This can enable early test
design and selection, prior to even having a software system to test, and
complement other types of test automation and analysis. We argue that this
quantification of test set diversity creates a number of opportunities to
better understand software quality and provides practical ways to increase it.Comment: In submissio
System Description for a Scalable, Fault-Tolerant, Distributed Garbage Collector
We describe an efficient and fault-tolerant algorithm for distributed cyclic
garbage collection. The algorithm imposes few requirements on the local
machines and allows for flexibility in the choice of local collector and
distributed acyclic garbage collector to use with it. We have emphasized
reducing the number and size of network messages without sacrificing the
promptness of collection throughout the algorithm. Our proposed collector is a
variant of back tracing to avoid extensive synchronization between machines. We
have added an explicit forward tracing stage to the standard back tracing stage
and designed a tuned heuristic to reduce the total amount of work done by the
collector. Of particular note is the development of fault-tolerant cooperation
between traces and a heuristic that aggressively reduces the set of suspect
objects.Comment: 47 pages, LaTe
Empirical Evaluation of Mutation-based Test Prioritization Techniques
We propose a new test case prioritization technique that combines both
mutation-based and diversity-based approaches. Our diversity-aware
mutation-based technique relies on the notion of mutant distinguishment, which
aims to distinguish one mutant's behavior from another, rather than from the
original program. We empirically investigate the relative cost and
effectiveness of the mutation-based prioritization techniques (i.e., using both
the traditional mutant kill and the proposed mutant distinguishment) with 352
real faults and 553,477 developer-written test cases. The empirical evaluation
considers both the traditional and the diversity-aware mutation criteria in
various settings: single-objective greedy, hybrid, and multi-objective
optimization. The results show that there is no single dominant technique
across all the studied faults. To this end, \rev{we we show when and the reason
why each one of the mutation-based prioritization criteria performs poorly,
using a graphical model called Mutant Distinguishment Graph (MDG) that
demonstrates the distribution of the fault detecting test cases with respect to
mutant kills and distinguishment
Experimental analysis of computer system dependability
This paper reviews an area which has evolved over the past 15 years: experimental analysis of computer system dependability. Methodologies and advances are discussed for three basic approaches used in the area: simulated fault injection, physical fault injection, and measurement-based analysis. The three approaches are suited, respectively, to dependability evaluation in the three phases of a system's life: design phase, prototype phase, and operational phase. Before the discussion of these phases, several statistical techniques used in the area are introduced. For each phase, a classification of research methods or study topics is outlined, followed by discussion of these methods or topics as well as representative studies. The statistical techniques introduced include the estimation of parameters and confidence intervals, probability distribution characterization, and several multivariate analysis methods. Importance sampling, a statistical technique used to accelerate Monte Carlo simulation, is also introduced. The discussion of simulated fault injection covers electrical-level, logic-level, and function-level fault injection methods as well as representative simulation environments such as FOCUS and DEPEND. The discussion of physical fault injection covers hardware, software, and radiation fault injection methods as well as several software and hybrid tools including FIAT, FERARI, HYBRID, and FINE. The discussion of measurement-based analysis covers measurement and data processing techniques, basic error characterization, dependency analysis, Markov reward modeling, software-dependability, and fault diagnosis. The discussion involves several important issues studies in the area, including fault models, fast simulation techniques, workload/failure dependency, correlated failures, and software fault tolerance
A survey of machine learning techniques applied to self organizing cellular networks
In this paper, a survey of the literature of the past fifteen years involving Machine Learning (ML) algorithms applied to self organizing cellular networks is performed. In order for future networks to overcome the current limitations and address the issues of current cellular systems, it is clear that more intelligence needs to be deployed, so that a fully autonomous and flexible network can be enabled. This paper focuses on the learning perspective of Self Organizing Networks (SON) solutions and provides, not only an overview of the most common ML techniques encountered in cellular networks, but also manages to classify each paper in terms of its learning solution, while also giving some examples. The authors also classify each paper in terms of its self-organizing use-case and discuss how each proposed solution performed. In addition, a comparison between the most commonly found ML algorithms in terms of certain SON metrics is performed and general guidelines on when to choose each ML algorithm for each SON function are proposed. Lastly, this work also provides future research directions and new paradigms that the use of more robust and intelligent algorithms, together with data gathered by operators, can bring to the cellular networks domain and fully enable the concept of SON in the near future
Test Case Selection Using CBIR and Clustering
Choosing test cases for the optimization process of information systems testing is crucial, because it helps to eliminate unnecessary and redundant testing data. However, its use in systems that address complex domains (e.g. images) is still underexplored. This paper presents a new approach that uses Content-Based Image Retrieval (CBIR), similarity functions and clustering techniques to select test cases from an image-based test suite. Two experiments performed on an image processing system show that our approach, when compared with random tests, can significantly enhance the performance of tests execution by reducing the test cases required to find a fault. The results also show the potential use of CBIR for information abstraction, as well as the effectiveness of similarity functions and clustering for test case selection
- …