2,957 research outputs found
Cause Clue Clauses: Error Localization using Maximum Satisfiability
Much effort is spent everyday by programmers in trying to reduce long,
failing execution traces to the cause of the error. We present a new algorithm
for error cause localization based on a reduction to the maximal satisfiability
problem (MAX-SAT), which asks what is the maximum number of clauses of a
Boolean formula that can be simultaneously satisfied by an assignment. At an
intuitive level, our algorithm takes as input a program and a failing test, and
comprises the following three steps. First, using symbolic execution, we encode
a trace of a program as a Boolean trace formula which is satisfiable iff the
trace is feasible. Second, for a failing program execution (e.g., one that
violates an assertion or a post-condition), we construct an unsatisfiable
formula by taking the trace formula and additionally asserting that the input
is the failing test and that the assertion condition does hold at the end.
Third, using MAX-SAT, we find a maximal set of clauses in this formula that can
be satisfied together, and output the complement set as a potential cause of
the error. We have implemented our algorithm in a tool called bug-assist for C
programs. We demonstrate the surprising effectiveness of the tool on a set of
benchmark examples with injected faults, and show that in most cases,
bug-assist can quickly and precisely isolate the exact few lines of code whose
change eliminates the error. We also demonstrate how our algorithm can be
modified to automatically suggest fixes for common classes of errors such as
off-by-one.Comment: The pre-alpha version of the tool can be downloaded from
http://bugassist.mpi-sws.or
Doctor of Philosophy
dissertationCompilers are indispensable tools to developers. We expect them to be correct. However, compiler correctness is very hard to be reasoned about. This can be partly explained by the daunting complexity of compilers. In this dissertation, I will explain how we constructed a random program generator, Csmith, and used it to find hundreds of bugs in strong open source compilers such as the GNU Compiler Collection (GCC) and the LLVM Compiler Infrastructure (LLVM). The success of Csmith depends on its ability of being expressive and unambiguous at the same time. Csmith is composed of a code generator and a GTAV (Generation-Time Analysis and Validation) engine. They work interactively to produce expressive yet unambiguous random programs. The expressiveness of Csmith is attributed to the code generator, while the unambiguity is assured by GTAV. GTAV performs program analyses, such as points-to analysis and effect analysis, efficiently to avoid ambiguities caused by undefined behaviors or unspecifed behaviors. During our 4.25 years of testing, Csmith has found over 450 bugs in the GNU Compiler Collection (GCC) and the LLVM Compiler Infrastructure (LLVM). We analyzed the bugs by putting them into different categories, studying the root causes, finding their locations in compilers' source code, and evaluating their importance. We believe analysis results are useful to future random testers, as well as compiler writers/users
Finding Bugs in Web Applications Using Dynamic Test Generation and Explicit State Model Checking
Web script crashes and malformed dynamically-generated web pages are common errors, and they seriously impact the usability of web applications. Current tools for web-page validation cannot handle the dynamically generated pages that are ubiquitous on today's Internet. We present a dynamic test generation technique for the domain of dynamic web applications. The technique utilizes both combined concrete and symbolic execution and explicit-state model checking. The technique generates tests automatically, runs the tests capturing logical constraints on inputs, and minimizes the conditions on the inputs to failing tests, so that the resulting bug reports are small and useful in finding and fixing the underlying faults. Our tool Apollo implements the technique for the PHP programming language. Apollo generates test inputs for a web application, monitors the application for crashes, and validates that the output conforms to the HTML specification. This paper presents Apollo's algorithms and implementation, and an experimental evaluation that revealed 302 faults in 6 PHP web applications
Finding Bugs In Dynamic Web Applications
Web script crashes and malformed dynamically-generated web pages are common errors, and they seriously impact usability of web applications. Currenttools for web-page validation cannot handle the dynamically-generatedpages that are ubiquitous on today's Internet.In this work, we apply a dynamic test generation technique, based oncombined concrete and symbolic execution, to the domain of dynamic webapplications. The technique generates tests automatically andminimizes the bug-inducing inputs to reduce duplication and to makethe bug reports small and easy to understand and fix.We implemented the technique in Apollo, an automated tool thatfound dozens of bugs in real PHP applications. Apollo generatestest inputs for the web application, monitors the application forcrashes, and validates that the output conforms to the HTMLspecification. This paper presents Apollo's algorithms andimplementation, and an experimental evaluation that revealed a totalof 214 bugs in 4 open-source PHP web applications
Configuring Test Generators using Bug Reports: A Case Study of GCC Compiler and Csmith
The correctness of compilers is instrumental in the safety and reliability of
other software systems, as bugs in compilers can produce executables that do
not reflect the intent of programmers. Such errors are difficult to identify
and debug. Random test program generators are commonly used in testing
compilers, and they have been effective in uncovering bugs. However, the
problem of guiding these test generators to produce test programs that are more
likely to find bugs remains challenging. In this paper, we use the code
snippets in the bug reports to guide the test generation. The main idea of this
work is to extract insights from the bug reports about the language features
that are more prone to inadequate implementation and using the insights to
guide the test generators. We use the GCC C compiler to evaluate the
effectiveness of this approach. In particular, we first cluster the test
programs in the GCC bugs reports based on their features. We then use the
centroids of the clusters to compute configurations for Csmith, a popular test
generator for C compilers. We evaluated this approach on eight versions of GCC
and found that our approach provides higher coverage and triggers more
miscompilation failures than the state-of-the-art test generation techniques
for GCC.Comment: The 36th ACM/SIGAPP Symposium on Applied Computing, Software
Verification and Testing Track (SAC-SVT'21
- …