94 research outputs found
Do Judge a Test by its Cover: Combining Combinatorial and Property-Based Testing
Property-based testing uses randomly generated inputs to validate high-level program specifications. It can be shockingly effective at finding bugs, but it often requires generating a very large number of inputs to do so. In this paper, we apply ideas from combinatorial testing, a powerful and widely studied testing methodology, to modify the distributions of our random generators so as to find bugs with fewer tests. The key concept is combinatorial coverage, which measures the degree to which a given set of tests exercises every possible choice of values for every small combination of input features. In its “classical” form, combinatorial coverage only applies to programs whose inputs have a very particular shape—essentially, a Cartesian product of finite sets. We generalize combinatorial coverage to the richer world of algebraic data types by formalizing a class of sparse test descriptions based on regular tree expressions. This new definition of coverage inspires a novel combinatorial thinning algorithm for improving the coverage of random test generators, requiring many fewer tests to catch bugs. We evaluate this algorithm on two case studies, a typed evaluator for System F terms and a Haskell compiler, showing significant improvements in both
Building High Strength Mixed Covering Arrays with Constraints
Covering arrays have become a key piece in Combinatorial Testing. In particular, we focus on the efficient construction of Covering Arrays with Constraints of high strength. SAT solving technology has been proven to be well suited when solving Covering Arrays with Constraints. However, the size of the SAT reformulations rapidly grows up with higher strengths. To this end, we present a new incomplete algorithm that mitigates substantially memory blow-ups. The experimental results confirm the goodness of the approach, opening avenues for new practical applications
Semantic Fuzzing with Zest
Programs expecting structured inputs often consist of both a syntactic
analysis stage, which parses raw input, and a semantic analysis stage, which
conducts checks on the parsed input and executes the core logic of the program.
Generator-based testing tools in the lineage of QuickCheck are a promising way
to generate random syntactically valid test inputs for these programs. We
present Zest, a technique which automatically guides QuickCheck-like
randominput generators to better explore the semantic analysis stage of test
programs. Zest converts random-input generators into deterministic parametric
generators. We present the key insight that mutations in the untyped parameter
domain map to structural mutations in the input domain. Zest leverages program
feedback in the form of code coverage and input validity to perform
feedback-directed parameter search. We evaluate Zest against AFL and QuickCheck
on five Java programs: Maven, Ant, BCEL, Closure, and Rhino. Zest covers
1.03x-2.81x as many branches within the benchmarks semantic analysis stages as
baseline techniques. Further, we find 10 new bugs in the semantic analysis
stages of these benchmarks. Zest is the most effective technique in finding
these bugs reliably and quickly, requiring at most 10 minutes on average to
find each bug.Comment: To appear in Proceedings of 28th ACM SIGSOFT International Symposium
on Software Testing and Analysis (ISSTA'19
Threats to the validity of mutation-based test assessment
Much research on software testing and test techniques relies on experimental studies based on mutation testing. In this paper we reveal that such studies are vulnerable to a potential threat to validity, leading to possible Type I errors; incorrectly rejecting the Null Hypothesis. Our findings indicate that Type I errors occur, for arbitrary experiments that fail to take countermeasures, approximately 62% of the time. Clearly, a Type I error would potentially compromise any scientific conclusion. We show that the problem derives from such studies’ combined use of both subsuming and subsumed mutants. We collected articles published in the last two years at three leading software engineering conferences. Of those that use mutation-based test assessment, we found that 68% are vulnerable to this threat to validity
Model-based integration testing technique using formal finite state behavioral models for component-based software
Many issues and challenges could be identified when considering integration testing of Component-Based Software Systems (CBSS). Consequently, several research have appeared in the literature, aimed at facilitating the integration testing of CBSS. Unfortunately, they suffer from a number of drawbacks and limitations such as difficulty of understanding and describing the behavior of integrated components, lack of effective formalism for test information, difficulty of analyzing and validating the integrated components, and exposing the components implementation by providing semi-formal models. Hence, these problems have made it in effective to test today’s modern complex CBSS. To address these problems, a model-based approach such as Model-Based Testing (MBT) tends to be a suitable mechanism and could be a potential solution to be applied in the context of integration testing of CBSS. Accordingly, this thesis presents a model-based integration testing technique for CBSS. Firstly, a method to extract the formal finite state behavioral models of integrated software components using Mealy machine models was developed. The extracted formal models were used to detect faulty interactions (integration bugs) or compositional problems between integrated components in the system. Based on the experimental results, the proposed method had significant impact in reducing the number of output queries required to extract the formal models of integrated software components and its performance was 50% better compared to the existing methods. Secondly, based on the extracted formal models, an effective model-based integration testing technique (MITT) for CBSS was developed. Finally, the effectiveness of the MITT was demonstrated by employing it in the air gourmet and elevator case studies, using three evaluation parameters. The experimental results showed that the MITT was effective and outperformed Shahbaz technique on the air gourmet and elevator case studies. In terms of learned components for air gourmet and elevator case studies respectively, the MITT results were better by 98.14% and 100%, output queries based on performance were 42.13% and 25.01%, and error detection capabilities were 70.62% and 75% for each of the case study
Directed Test Program Generation for JIT Compiler Bug Localization
Bug localization techniques for Just-in-Time (JIT) compilers are based on
analyzing the execution behaviors of the target JIT compiler on a set of test
programs generated for this purpose; characteristics of these test inputs can
significantly impact the accuracy of bug localization. However, current
approaches for automatic test program generation do not work well for bug
localization in JIT compilers. This paper proposes a novel technique for
automatic test program generation for JIT compiler bug localization that is
based on two key insights: (1) the generated test programs should contain both
passing inputs (which do not trigger the bug) and failing inputs (which trigger
the bug); and (2) the passing inputs should be as similar as possible to the
initial seed input, while the failing programs should be as different as
possible from it. We use a structural analysis of the seed program to determine
which parts of the code should be mutated for each of the passing and failing
cases. Experiments using a prototype implementation indicate that test inputs
generated using our approach result in significantly improved bug localization
results than existing approaches
- …