Search CORE

703 research outputs found

Help, help, Im being suppressed the significance of suppressors in software testing

Author: Groce Alex
Regehr John
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2013
Field of study

pre-printAbstract-Test features are basic compositional units used to describe what a test does (and does not) involve. For example, in API-based testing, the most obvious features are function calls; in grammar-based testing, the obvious features are the elements of the grammar. The relationship between features as abstractions of tests and produced behaviors of the tested program is surprisingly poorly understood. This paper shows how large-scale random testing modified to use diverse feature sets can uncover causal relationships between what a test contains and what the program being tested does. We introduce a general notion of observable behaviors as targets, where a target can be a detected fault, an executed branch or statement, or a complex coverage entity such as a state, predicate-valuation, or program path. While it is obvious that targets have triggers - features without which they cannot be hit by a test - the notion of suppressors - features which make a test less likely to hit a target - has received little attention despite having important implications for automated test generation and program understanding. For a set of subjects including C compilers, a flash file system, and JavaScript engines, we show that suppression is both common and important

The University of Utah: J. Willard Marriott Digital Library

Recommended from our members

Establishing Flight Software Reliability: Testing, Model Checking, Constraint-Solving, Monitoring and Learning

Author: Groce Alex
Havelund Klaus
Holzmann Gerard
Joshi Rajeev
Xu Ru-Gang
Publication venue: 'Springer Fachmedien Wiesbaden GmbH'
Publication date
Field of study

In this paper we discuss the application of a range of techniques to the verification of mission-critical flight software at NASA’s Jet Propulsion Laboratory. For this type of application we want to achieve a higher level of confidence than can be achieved through standard software testing. Unfortunately, given the current state of the art, especially when efforts are constrained by the tight deadlines and resource limitations of a flight project, it is not feasible to produce a rigorous formal proof of correctness of even a well-specified stand-alone module such as a file system (much less more tightly coupled or difficult-to-specify modules). This means that we must look for a practical alternative in the area between traditional testing and proof, as we attempt to optimize rigor and coverage. The approaches we describe here are based on testing, model checking, constraint-solving, monitoring, and finite-state machine learning, in addition to static code analysis. The results we have obtained in the domain of file systems are encouraging, and suggest that for more complex properties of programs with complex data structures, it is possibly more beneficial to use constraint solvers to guide and analyze execution (i.e., as in testing, even if performed by a model checking tool) than to translate the program and property into a set of constraints, as in abstraction-based and bounded model checkers. Our experience with non-file-system flight software modules shows that methods even further removed from traditional static formal methods can be assisted by formal approaches, yet readily adopted by test engineers and software developers, even as the key problem shifts from test generation and selection to test evaluation.Keywords: Verification, Formal proof, Flight software, File systems, Model checking, Testin

ScholarsArchive@OSU

Using test case reduction and prioritization to improve symbolic execution

Author: Alex Groce
Chaoqiang Zhang
Mohammad Amin Alipour
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2014
Field of study

Scaling symbolic execution to large programs or programs with complex inputs remains difficult due to path explosion and complex constraints, as well as external method calls. Additionally, creating an effective test structure with sym-bolic inputs can be difficult. A popular symbolic execution strategy in practice is to perform symbolic execution not “from scratch ” but based on existing test cases. This paper proposes that the effectiveness of this approach to symbolic execution can be enhanced by (1) reducing the size of seed test cases and (2) prioritizing seed test cases to maximize ex-ploration efficiency. The proposed test case reduction strat-egy is based on a recently introduced generalization of delta-debugging, and our prioritization techniques include novel methods that, for this purpose, can outperform some tradi-tional regression testing algorithms. We show that applying these methods can significantly improve the effectiveness of symbolic execution based on existing test cases

CiteSeerX

Crossref

Swarm testing

Author: Groce Alex
Regehr John
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2012
Field of study

ManuscriptSwarm testing is a novel and inexpensive way to improve the diversity of test cases generated during random testing. Increased diversity leads to improved coverage and fault detection. In swarm testing, the usual practice of potentially including all features in every test case is abandoned. Rather, a large "swarm" of randomly generated configurations, each of which omits some features, is used, with configurations receiving equal resources. We have identified two mechanisms by which feature omission leads to better exploration of a system's state space. First, some features actively prevent the system from executing interesting behaviors; e.g., "pop" calls may prevent a stack data structure from executing a bug in its overflow detection logic. Second, even when there is no active suppression of behaviors, test features compete for space in each test, limiting the depth to which logic driven by features can be explored. Experimental results show that swarm testing increases coverage and can improve fault detection dramatically; for example, in a week of testing it found 42% more distinct ways to crash a collection of C compilers than did the heavily hand-tuned default configuration of a random tester

The University of Utah: J. Willard Marriott Digital Library

A Systematic Mapping Study on Test Automation Approaches for Flight Simulators

Author: Jensén Linus
Publication venue
Publication date: 01/01/2022
Field of study

Background: Test automation is the practice of running tests automatically for the purpose of improving software quality. This is a field that is used in many kinds of areas. Flight simulators are software and hardware that help us achieve flight training and testing without flying. Test automation for flight simulators aims to improve the testing processes for this kind of software. Objective: To present and provide a comprehensive, unbiased overview of the state of test automation approaches for flight simulators Method: A Systematic Mapping Study (SMS) of the existing test automation approaches for flight simulators. Results: 20 papers whose main topic or mentions test automation approaches for flight simulators. The results show that there are multiple ways of achieving test automation, with the purpose of speeding up the testing process and this is mostly on the software level. This field is growing and has since 2007 grown with almost yearly publications. Conclusions: This is a growing field, and it will likely continue to grow as technology advances and new testing methods and flight software emerges. Currently there is not any go to approach that is widely used, although there are several methods that are slightly more popular than others

National Library of Finland DSpace Services

Cause reduction for quick testing

Author: Groce Alex
Regehr John
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2014
Field of study

pre-printAbstract-In random testing, it is often desirable to produce a "quick test" - an extremely inexpensive test suite that can serve as a frequently applied regression and allow the benefits of random testing to be obtained even in very slow or oversubscribed test environments. Delta debugging is an algorithm that, given a failing test case, produces a smaller test case that also fails, and typically executes much more quickly. Delta debugging of random tests can produce effective regression suites for previously detected faults, but such suites often have little power for detecting new faults, and in some cases provide poor code coverage. This paper proposes extending delta debugging by simplifying tests with respect to code coverage, an instance of a generalization of delta debugging we call cause reduction. We show that test suites reduced in this fashion can provide very effective quick tests for real-world programs. For Mozilla's SpiderMonkey JavaScript engine, the reduced suite is more effective for finding software faults, even if its reduced runtime is not considered. The effectiveness of a reduction-based quick test persists through major changes to the software under test

The University of Utah: J. Willard Marriott Digital Library

Doctor of Philosophy

Author: Chen Yang
Publication venue: University of Utah
Publication date: 01/05/2014
Field of study

dissertationAggressive random testing tools, or fuzzers, are impressively effective at finding bugs in compilers and programming language runtimes. For example, a single test-case generator has resulted in more than 460 bugs reported for a number of production-quality C compilers. However, fuzzers can be hard to use. The first problem is that failures triggered by random test cases can be difficult to debug because these tests are often large. To report a compiler bug, one must often construct a small test case that triggers the bug. The existing automated test-case reduction technique, delta debugging, is not sufficient to produce small, reportable test cases. A second problem is that fuzzers are indiscriminate: they repeatedly find bugs that may not be severe enough to fix right away. Third, fuzzers tend to generate a large number of test cases that only trigger a few bugs. Some bugs are triggered much more frequently than others, creating needle-in-the-haystack problems. Currently, users rule out undesirable test cases using ad hoc methods such as disallowing problematic features in tests and filtering test results. This dissertation investigates approaches to improving the utility of compiler fuzzers. Two components, an aggressive test-case reducer and a tamer, are added to the fuzzing workflow to make the fuzzer more user friendly. We introduce C-Reduce, an aggressive test-case reducer for C/C++ programs, which exploits rich domain-specific knowledge to output test cases nearly as good as those produced by skilled humans. This reducer produces outputs that are, on average, more than 30 times smaller than those produced by the existing reducer that is most commonly used by compiler engineers. Second, this dissertation formulates and addresses the fuzzer taming problem: given a potentially large number of random test cases that trigger failures, order them such that diverse, interesting test cases are highly ranked. Bug triage can be effectively automated, relying on techniques from machine learning to suppress duplicate bug-triggering test cases and test cases triggering known bugs. An evaluation shows the ability of this tool to solve the fuzzer taming problem for 3,799 test cases triggering 46 bugs in a C compiler

The University of Utah: J. Willard Marriott Digital Library