A Survey of Symbolic Execution Techniques
Many security and software testing applications require checking whether
certain properties of a program hold for any possible usage scenario. For
instance, a tool for identifying software vulnerabilities may need to rule out
the existence of any backdoor to bypass a program's authentication. One
approach would be to test the program using different, possibly random inputs.
As the backdoor may only be hit for very specific program workloads, automated
exploration of the space of possible inputs is of the essence. Symbolic
execution provides an elegant solution to the problem, by systematically
exploring many possible execution paths at the same time without necessarily
requiring concrete inputs. Rather than taking on fully specified input values,
the technique abstractly represents them as symbols, resorting to constraint
solvers to construct actual instances that would cause property violations.
Symbolic execution has been incubated in dozens of tools developed over the
last four decades, leading to major practical breakthroughs in a number of
prominent software reliability applications. The goal of this survey is to
provide an overview of the main ideas, challenges, and solutions developed in
the area, distilling them for a broad audience.
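As a toy illustration of the idea sketched in this abstract (our own example, not taken from the survey), the snippet below "executes" a tiny program over a symbolic integer: it forks at each branch, collects the path condition for each path, and uses a brute-force search over a small domain as a stand-in for the SMT solver (e.g. Z3) that real engines delegate to.

```python
# Hypothetical program under test:
#     if x > 10:
#         if x % 7 == 0:
#             BUG
# Each path is a (constraints, label) pair; constraints are predicates over
# the concrete value that will later be substituted for the symbol x.

def program_paths():
    return [
        ([lambda x: x <= 10], "ok"),
        ([lambda x: x > 10, lambda x: x % 7 != 0], "ok"),
        ([lambda x: x > 10, lambda x: x % 7 == 0], "BUG"),
    ]

def solve(constraints, domain=range(-100, 101)):
    """Brute-force stand-in for a constraint solver: return some concrete
    value satisfying every constraint on the path, or None."""
    for v in domain:
        if all(c(v) for c in constraints):
            return v
    return None

def find_bug_input():
    """Symbolic-execution driver: ask the 'solver' for an input that
    drives execution down the buggy path."""
    for constraints, label in program_paths():
        if label == "BUG":
            return solve(constraints)
    return None
```

Here `find_bug_input()` yields a concrete witness (the smallest value above 10 divisible by 7), exactly the kind of property-violating instance the survey describes constraint solvers constructing.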
The present survey has been accepted for publication at ACM Computing Surveys. This is the authors' pre-print copy. If you are considering citing this survey, we would appreciate if you could use the following BibTeX entry: http://goo.gl/Hf5Fv
Technical Report: Region and Effect Inference for Safe Parallelism
In this paper, we present the first full regions-and-effects inference algorithm for explicitly parallel fork-join programs. We infer annotations inspired by Deterministic Parallel Java (DPJ) for a type-safe subset of C++. We chose the DPJ annotations because they give the strongest safety guarantees of any existing concurrency-checking approach we know of, static or dynamic, and it is also the most expressive static checking system we know of that gives strong safety guarantees. This expressiveness, however, makes manual annotation difficult and tedious, which motivates the need for automatic inference,
but it also makes the inference problem very challenging: the code may use region polymorphism, imperative updates with complex aliasing,
arbitrary recursion, hierarchical region specifications, and wildcard elements to describe potentially infinite sets of regions. We express the inference as a constraint satisfaction problem and develop, implement, and evaluate an algorithm for solving it. The region and effect annotations inferred by the algorithm constitute a checkable proof of safe parallelism,
which can be recorded both for documentation and for fast and modular
safety checking.
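To make the safety criterion concrete, here is a minimal sketch (ours, not the paper's algorithm) of the check that inferred effects enable: each task carries a read set and a write set of abstract regions, and two forked tasks may run in parallel only when neither writes a region the other reads or writes.

```python
def noninterfering(effect1, effect2):
    """Effects are (read_set, write_set) pairs of abstract region names."""
    r1, w1 = effect1
    r2, w2 = effect2
    return w1.isdisjoint(r2 | w2) and w2.isdisjoint(r1 | w1)

# Hypothetical effects inferred for subtasks of a fork-join tree update:
left_task  = ({"Tree.left"},  {"Tree.left"})   # touches only the left subtree
right_task = ({"Tree.right"}, {"Tree.right"})  # touches only the right subtree
reader     = ({"Tree.left"},  set())           # reads the left subtree
```

The two subtree tasks commute because their region footprints are disjoint, whereas the reader conflicts with `left_task`; the inference problem the paper solves is producing region annotations precise enough to make such disjointness provable.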
Parametric Heap Abstraction for Dynamic Language Libraries
For commercial development, dynamic languages are growing in popularity. Consequently, dynamic language developers must consider the correctness of their code. Deployment of correct or sufficiently correct code is critical to the success and adoption of that code. However, it is challenging to ensure the correctness of dynamic language code when it is a library. Inputs to dynamic language libraries are often not simple values. They can also be open objects, which permit adding, removing, and iterating over attribute names, and they can be functions that may be called. Furthermore, the result of running library functions on objects is often new objects derived from the input objects.
To ensure the correctness of dynamic language libraries, this dissertation uses static analysis. Static analysis is typically used to infer facts about programs' values, but in these dynamic language libraries, values can be objects and functions. These objects may be unknown, and thus have an unknown set of attribute names, because they are inputs, or they may be iteratively derived from other objects. Functions may be stored, called, or wrapped in other functions, regardless of whether they are known. A static analysis for dynamic language libraries must handle these cases.
To support static analysis of libraries, this dissertation introduces local heap abstractions suitable for representing parts of memory that library code may affect. These abstractions build upon abstractions for sets that enable representing relations between attribute names of otherwise unknown objects. Furthermore, the local reasoning is utilized to abstract the effect of calling unknown functions.
The precision of these abstractions is demonstrated on a range of small but complex JavaScript-inspired examples. The examples are extracted from libraries such as Traits.js and Google Closure.
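One simple way to picture an abstraction for open objects (this is our illustration, not the dissertation's formalism) is to track a "must" set of attributes the object certainly has and a "may" set it possibly has, with a completely unknown input object having an unbounded may set:

```python
class AbsObj:
    """Abstract open object: `must` = attributes certainly present,
    `may` = attributes possibly present; may=None means any name at all
    (a completely unknown library input)."""
    def __init__(self, must=frozenset(), may=frozenset()):
        self.must = frozenset(must)
        self.may = None if may is None else frozenset(may) | self.must

    def add(self, name):
        """Strong update: after `o.name = v`, the attribute is certain."""
        may = None if self.may is None else self.may | {name}
        return AbsObj(self.must | {name}, may)

    def join(self, other):
        """Merge the abstractions flowing from two control-flow branches."""
        if self.may is None or other.may is None:
            may = None
        else:
            may = self.may | other.may
        return AbsObj(self.must & other.must, may)

    def definitely_has(self, name):
        return name in self.must
```

After joining two branches that each add a different attribute, neither attribute is certain any more, while an attribute written on an unknown input object stays certain; capturing relations between the attribute names of such objects is what the dissertation's set abstractions refine.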
Template-based verification of heap-manipulating programs
We propose a shape analysis suitable for analysis engines that perform automatic invariant inference using an SMT solver. The proposed solution includes an abstract template domain that encodes the shape of a program heap based on logical formulae over bit-vectors. It is based on a points-to relation between pointers and symbolic addresses of abstract memory objects. Our abstract heap domain can be combined with value domains in a straightforward manner, which particularly allows us to reason about shapes and contents of heap structures at the same time. The information obtained from the analysis can be used to prove reachability and memory safety properties of programs manipulating dynamic data structures, mainly linked lists. The solution has been implemented in 2LS and compared against state-of-the-art tools that perform the best in heap-related categories of the well-known Software Verification Competition (SV-COMP). Results show that 2LS outperforms these tools on benchmarks requiring combined reasoning about unbounded data structures and their numerical contents.
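The core objects here can be sketched very simply (our simplification of the idea, not 2LS's bit-vector encoding): the abstract heap is a points-to relation from pointer variables and object fields to symbolic addresses, and reachability properties follow from its transitive closure.

```python
def reachable(points_to, src):
    """points_to maps a pointer variable name, or an (address, field) pair,
    to a set of symbolic addresses; return all addresses reachable from
    `src` by following `next` fields."""
    seen, work = set(), list(points_to.get(src, ()))
    while work:
        addr = work.pop()
        if addr in seen:
            continue
        seen.add(addr)
        work.extend(points_to.get((addr, "next"), ()))
    return seen

# Abstract singly linked list: head -> o1 -> o2 -> (null, i.e. no address)
heap = {"head": {"o1"}, ("o1", "next"): {"o2"}, ("o2", "next"): set()}
```

Combining such a shape component with a value domain would additionally attach, say, an interval to each symbolic address, which is what lets the analysis reason about list shape and list contents simultaneously.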
Lock Inference for Java
Atomicity is an important property for concurrent software, as it provides a stronger guarantee
against errors caused by unanticipated thread interactions than race-freedom does. However,
concurrency control in general is tricky to get right because current techniques are too low-level
and error-prone. With the introduction of multicore processors, the problems are compounded.
Consequently, a new software abstraction, called atomic sections, is gaining popularity to take
care of concurrency control and the enforcement of atomicity properties.
One possible implementation of their semantics is to acquire a global lock upon entry to each
atomic section, ensuring that they execute in mutual exclusion. However, this cripples concurrency,
as non-interfering atomic sections cannot run in parallel. Transactional memory is
another automated technique for providing atomicity, but relies on the ability to rollback conflicting atomic sections and thus places restrictions on the use of irreversible operations, such as
I/O and system calls, or serialises all sections that use such features. Therefore, from a language
designer's point of view, the challenge is to implement atomic sections without compromising
performance or expressivity.
This thesis explores the technique of lock inference, which infers a set of locks for each atomic
section, while attempting to balance the requirements of maximal concurrency, minimal locking
overhead and freedom from deadlock. We focus on lock-inference techniques for tackling large
Java programs that make use of mature libraries. This improves upon existing work, which
either (i) ignores libraries, (ii) requires library implementors to annotate which locks to take, or
(iii) only considers accesses performed up to one-level deep in library call chains. As a result,
each of these prior approaches may result in atomicity violations. This is a problem because
even simple uses of I/O in Java programs can involve large amounts of library code. Our
approach is the first to analyse library methods in full and is thus able to soundly handle atomic
sections involving complicated real-world side effects, while still permitting atomic sections to
run concurrently in cases where their lock sets are disjoint.
To validate our claims, we have implemented our techniques in Lockguard, a fully automatic
tool that translates Java bytecode containing atomic sections to an equivalent program that
uses locks instead. We show that our techniques scale well and that, despite protecting all library
accesses, we obtain performance comparable to the original locking policy of our benchmarks.
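A toy rendering of the lock-inference idea (ours, not Lockguard itself): each atomic section's lock set is the union of the objects it may touch directly and those reached through its library calls, and two sections may run in parallel exactly when their lock sets are disjoint.

```python
def lock_set(direct_accesses, library_accesses):
    """Union of objects a section may touch directly plus those reached
    through its library calls (a stand-in for the full library analysis)."""
    locks = set(direct_accesses)
    for acc in library_accesses:
        locks |= set(acc)
    return locks

# Hypothetical atomic sections: two transfers that both log through a
# shared library stream, and one that performs no library calls at all.
sec_a = lock_set({"account_a"}, [{"Logger.out"}])
sec_b = lock_set({"account_b"}, [{"Logger.out"}])
sec_c = lock_set({"account_c"}, [])

def may_run_in_parallel(locks1, locks2):
    return locks1.isdisjoint(locks2)
```

The example mirrors the thesis's point about approaches that ignore libraries: dropping `Logger.out` from the analysis would wrongly let `sec_a` and `sec_b` run concurrently and reintroduce the atomicity violation.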
Deductive Synthesis and Repair
In this thesis, we explore techniques for the development of recursive functional programs over unbounded domains that are proved correct according to their high-level specifications. We present algorithms for automatically synthesizing executable code, starting from the specification alone. We implement these algorithms in the Leon system. We augment relational specifications with a concise notation for symbolic tests, which are helpful to characterize fragments of the functions' behavior. We build on our synthesis procedure to automatically repair invalid functions by generating alternative implementations. Our approach therefore formulates program repair in the framework of deductive synthesis and uses the existing program structure as a hint to guide synthesis. We rely on user-specified tests as well as automatically generated ones to localize the fault. This localization enables our procedure to repair functions that would otherwise be out of reach of our synthesizer, and ensures that most of the original behavior is preserved. We also investigate multiple ways of enabling Leon programs to interact with external, untrusted code. For that purpose, we introduce a precise inter-procedural effect analysis for arbitrary Scala programs with mutable state, dynamic object allocation, and dynamic dispatch. We analyzed the Scala standard library, containing 58000 methods, and classified them into several categories according to their effects. Our analysis proves that over one half of all methods are pure, identifies a number of conditionally pure methods, and computes summary graphs and regular expressions describing the side effects of non-pure methods. We implement the synthesis and repair algorithms within the Leon system and deploy them as part of a novel interactive development environment available as a web interface.
Our implementation is able to synthesize, within seconds, a number of useful recursive functions that manipulate unbounded numbers and data structures. Our repair procedure automatically locates various kinds of errors in recursive functions and fixes them by synthesizing alternative implementations.
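A tiny enumerate-and-check sketch of the repair loop (our illustration; Leon's deductive rules and counterexample generation are far richer): given a function that fails some tests, search a small space of alternative implementations, keeping the original body as the first candidate so that existing structure acts as a hint, and return the first candidate passing every test.

```python
# Spec expressed as symbolic-test-style input/output pairs: f(x) == x * x.
TESTS = [(0, 0), (1, 1), (2, 4), (3, 9)]

def broken(x):
    return x + x          # faulty implementation to be repaired

# Candidate space; a real synthesizer enumerates expressions, here we
# just list a few hypothetical alternatives, original body first.
CANDIDATES = [broken, lambda x: x * x, lambda x: x ** 3, lambda x: x]

def repair(candidates, tests):
    """Return the first candidate consistent with all tests, or None."""
    for f in candidates:
        if all(f(i) == o for i, o in tests):
            return f
    return None
```

The tests localize the fault in the sense that `broken` is rejected on the second pair already, and the repaired function then generalizes beyond the test inputs.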
Automated Verification of Complete Specification with Shape Inference
Ph.D. thesis.
A scalable mixed-level approach to dynamic analysis of C and C++ programs
Thesis (M. Eng.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2006. Includes bibliographical references (p. 107-112).
This thesis addresses the difficult task of constructing robust and scalable dynamic program analysis tools for programs written in memory-unsafe languages such as C and C++, especially those tools that are interested in observing the contents of data structures at run time. In this thesis, I first introduce my novel mixed-level approach to dynamic analysis, which combines the advantages of both source- and binary-based approaches. Second, I present a tool framework that embodies the mixed-level approach. This framework provides memory safety guarantees, allows tools built upon it to access rich source- and binary-level information simultaneously at run time, and enables tools to scale to large, real-world C and C++ programs on the order of millions of lines of code. Third, I present two dynamic analysis tools built upon my framework - one for performing value profiling and the other for performing dynamic inference of abstract types - and describe how they far surpass previous analyses in terms of scalability, robustness, and applicability. Lastly, I present several case studies demonstrating how these tools aid both humans and automated tools in several program analysis tasks: improving human understanding of unfamiliar code, invariant detection, and data structure repair.
by Philip Jia Guo. M.Eng.
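The value-profiling idea can be illustrated in a few lines (the thesis targets C/C++ binaries; this Python sketch only shows the concept of observing run-time values): a trace hook records every value bound to a function's local variables during execution, data that could later feed invariant detection or abstract-type inference.

```python
import sys
from collections import defaultdict

def profile_values(fn, *args):
    """Run fn(*args) under a tracer and record, per local variable name,
    the set of values observed at each executed line."""
    observed = defaultdict(set)

    def tracer(frame, event, arg):
        if event == "line" and frame.f_code is fn.__code__:
            for name, val in frame.f_locals.items():
                observed[name].add(val)
        return tracer

    sys.settrace(tracer)
    try:
        fn(*args)
    finally:
        sys.settrace(None)   # always detach the tracer
    return observed

def gcd(a, b):
    while b:
        a, b = b, a % b
    return a
```

Profiling `gcd(12, 8)` shows `a` taking the values 12, 8, 4 and `b` taking 8, 4, 0; invariants such as "b is always non-negative" could then be conjectured from such traces.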