Synthesizing Program Input Grammars

Author: Albarghouthi A.
Cadar C.
Cho C. Y.
Forrester J. E.
Godefroid P.
Holler C.
Huang L.
Lee L.
Oncina J.
Solomonoff R. J.
Sutton M.
Vardhan A.
Viide J.
Wondracek G.
Publication date: 16/06/2017

We present an algorithm for synthesizing a context-free grammar encoding the language of valid program inputs from a set of input examples and blackbox access to the program. Our algorithm addresses shortcomings of existing grammar inference algorithms, which both severely overgeneralize and are prohibitively slow. Our implementation, GLADE, leverages the grammar synthesized by our algorithm to fuzz test programs with structured inputs. We show that GLADE substantially increases the incremental coverage on valid inputs compared to two baseline fuzzers.
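As a rough illustration of how a context-free grammar can drive input generation for fuzzing, the sketch below samples strings from a small hand-written grammar. It does not reproduce the synthesis algorithm described in the abstract or GLADE itself; the grammar `GRAMMAR`, the functions `generate` and `is_balanced`, and the `max_depth` cutoff are all hypothetical names chosen for this example.

```python
import random

# Hypothetical toy grammar (an assumption for illustration, not taken from the paper):
# S -> "(" S ")" S | empty, i.e. the language of balanced parentheses.
GRAMMAR = {
    "S": [["(", "S", ")", "S"], []],
}


def generate(symbol, rng, depth=0, max_depth=8):
    """Expand `symbol` into a string by choosing productions at random."""
    if symbol not in GRAMMAR:
        # Terminal symbol: emit it unchanged.
        return symbol
    productions = GRAMMAR[symbol]
    if depth >= max_depth:
        # Past the depth cutoff, take the shortest production so expansion stops.
        production = min(productions, key=len)
    else:
        production = rng.choice(productions)
    return "".join(generate(s, rng, depth + 1, max_depth) for s in production)


def is_balanced(text):
    """Stand-in for the program under test: accept only balanced parentheses."""
    depth = 0
    for ch in text:
        depth += 1 if ch == "(" else -1
        if depth < 0:
            return False
    return depth == 0


rng = random.Random(0)  # fixed seed so runs are repeatable
samples = [generate("S", rng) for _ in range(20)]
valid = [s for s in samples if is_balanced(s)]
print(f"{len(valid)} of {len(samples)} generated inputs were accepted")
```

Because every sample is derived from the grammar, each one is accepted by the checker; a grammar-based fuzzer would typically mutate such derivations further to probe inputs near the boundary of validity.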

arXiv.org e-Print Archive

Fault-based regression testing in a reactive environment

Author: Richardson Debra J.
Publication venue: eScholarship, University of California
Publication date: 01/01/1989

Regression testing is the process of retesting software after modification. Regression testing is a major factor contributing to the high cost of software maintenance. To control this cost, regression testing must be accomplished efficiently through effective reuse of test cases and judicious generation of new test cases. Fault-based testing focuses on the detection of particular classes of faults. RELAY is a fault-based testing technique that guarantees the detection of errors caused by any fault in a chosen fault classification. RELAY can be used as a regression testing technique to generate the test cases required to demonstrate that a modification is properly made. In addition, the information related to a test case chosen to detect a potential fault guides the choice of previously selected test cases that should be reused for a given modification. This paper presents the concepts behind RELAY and discusses how RELAY could be used as a regression testing technique. It also describes a testing environment that supports reactive regression testing as well as testing throughout the development lifecycle, which is based on integrating the RELAY model with other testing techniques.

eScholarship - University of California

Stateful Testing: Finding More Errors in Code and Contracts

Author: Furia Carlo A.
Horton Alexander
Meyer Bertrand
Nordio Martin
Pei Yu
Roth Hannes
Steindorfer Michael
Wei Yi
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 17/08/2011

Automated random testing has been shown to be an effective approach to finding faults but still faces a major unsolved issue: how to generate test inputs diverse enough to find many faults and find them quickly. Stateful testing, the automated testing technique introduced in this article, generates new test cases that improve an existing test suite. The generated test cases are designed to violate the dynamically inferred contracts (invariants) characterizing the existing test suite. As a consequence, they are in a good position to detect new errors, and also to improve the accuracy of the inferred contracts by discovering those that are unsound. Experiments on 13 data structure classes totalling over 28,000 lines of code demonstrate the effectiveness of stateful testing in improving over the results of long sessions of random testing: stateful testing found 68.4% new errors and improved the accuracy of automatically inferred contracts to over 99%, with just a 7% time overhead.

arXiv.org e-Print Archive

Language Modeling by Clustering with Word Embeddings for Text Readability Assessment

Author: Chall J.S.
Flor Michael
Le Quoc V
Stenner A.J.
Publication date: 04/09/2017

We present a clustering-based language model using word embeddings for text readability prediction. Presumably, a Euclidean semantic space hypothesis holds true for word embeddings whose training is done by observing word co-occurrences. We argue that clustering with word embeddings in the metric space should yield feature representations in a higher semantic space appropriate for text regression. Also, by representing features in terms of histograms, our approach can naturally address documents of varying lengths. An empirical evaluation using the Common Core Standards corpus reveals that the features formed on our clustering-based language model significantly improve the previously known results for the same corpus in readability prediction. We also evaluate the task of sentence matching based on semantic relatedness using the Wiki-SimpleWiki corpus and find that our features lead to superior matching performance.
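To make the histogram idea concrete, the following minimal sketch turns documents of different lengths into fixed-size feature vectors by counting how often each word cluster occurs and normalizing by the number of counted tokens. The word-to-cluster assignment `WORD_TO_CLUSTER` is a hand-written, hypothetical stand-in for the result of clustering word embeddings, and `cluster_histogram` is an illustrative name, not the authors' implementation.

```python
from collections import Counter

# Hypothetical cluster assignment (an assumption for illustration); in practice it
# would come from clustering word embeddings, for example with k-means.
WORD_TO_CLUSTER = {"cat": 0, "dog": 0, "run": 1, "jump": 1, "the": 2}
NUM_CLUSTERS = 3


def cluster_histogram(tokens, word_to_cluster, num_clusters):
    """Return a length-normalized histogram of cluster ids for one document."""
    # Tokens without a cluster assignment are skipped.
    counts = Counter(word_to_cluster[t] for t in tokens if t in word_to_cluster)
    total = sum(counts.values())
    if total == 0:
        return [0.0] * num_clusters
    return [counts.get(c, 0) / total for c in range(num_clusters)]


short_doc = ["the", "cat", "run"]
long_doc = ["the", "dog", "jump", "the", "cat", "run"]

short_vec = cluster_histogram(short_doc, WORD_TO_CLUSTER, NUM_CLUSTERS)
long_vec = cluster_histogram(long_doc, WORD_TO_CLUSTER, NUM_CLUSTERS)
print(short_vec)
print(long_vec)
```

Both documents map to vectors of the same dimension regardless of their length, which is what allows a single regression model to be trained over documents of varying sizes.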

arXiv.org e-Print Archive

Automatically Generating Random Test Data for Relevant and Implicitly Defined Subdomains

Author: Murphy John Alexander
Publication venue: W&M ScholarWorks
Publication date: 01/01/2008

College of William & Mary: W&M Publish