2,027 research outputs found
Abstract Interpretation with Unfoldings
We present and evaluate a technique for computing path-sensitive interference
conditions during abstract interpretation of concurrent programs. In lieu of
fixed point computation, we use prime event structures to compactly represent
causal dependence and interference between sequences of transformers. Our main
contribution is an unfolding algorithm that uses a new notion of independence
to avoid redundant transformer application, thread-local fixed points to reduce
the size of the unfolding, and a novel cutoff criterion based on subsumption to
guarantee termination of the analysis. Our experiments show that the abstract
unfolding produces an order of magnitude fewer false alarms than a mature
abstract interpreter, while being several orders of magnitude faster than
solver-based tools that have the same precision.Comment: Extended version of the paper (with the same title and authors) to
appear at CAV 201
Targeted Greybox Fuzzing with Static Lookahead Analysis
Automatic test generation typically aims to generate inputs that explore new
paths in the program under test in order to find bugs. Existing work has,
therefore, focused on guiding the exploration toward program parts that are
more likely to contain bugs by using an offline static analysis.
In this paper, we introduce a novel technique for targeted greybox fuzzing
using an online static analysis that guides the fuzzer toward a set of target
locations, for instance, located in recently modified parts of the program.
This is achieved by first semantically analyzing each program path that is
explored by an input in the fuzzer's test suite. The results of this analysis
are then used to control the fuzzer's specialized power schedule, which
determines how often to fuzz inputs from the test suite. We implemented our
technique by extending a state-of-the-art, industrial fuzzer for Ethereum smart
contracts and evaluate its effectiveness on 27 real-world benchmarks. Using an
online analysis is particularly suitable for the domain of smart contracts
since it does not require any code instrumentation---instrumentation to
contracts changes their semantics. Our experiments show that targeted fuzzing
significantly outperforms standard greybox fuzzing for reaching 83% of the
challenging target locations (up to 14x of median speed-up)
An executable formal semantics of PHP with applications to program analysis
Nowadays, many important activities in our lives involve the web. However, the software and protocols on which web applications are based were not designed with the appropriate level of security in mind. Many web applications have reached a level of complexity for which testing, code reviews and human inspection are no longer sufficient quality-assurance guarantees. Tools that employ static analysis techniques are needed in order to explore all possible execution paths through an application and guarantee the absence of undesirable behaviours. To make sure that an analysis captures the properties of interest, and to navigate the trade-offs between efficiency and precision, it is necessary to base the design and the development of static analysis tools on a firm understanding of the language to be analysed. When this underlying knowledge is missing or erroneous, tools can’t be trusted no matter what advanced techniques they use to perform their task. In this Thesis, we introduce KPHP, the first executable formal semantics of PHP, one of the most popular languages for server-side web programming. Then, we demonstrate its practical relevance by developing two verification tools, of increasing complexity, on top of it - a simple verifier based on symbolic execution and LTL model checking and a general purpose, fully configurable and extensible static analyser based on Abstract Interpretation. Our LTL-based tool leverages the existing symbolic execution and model checking support offered by K, our semantics framework of choice, and constitutes a first proof-of-concept of the usefulness of our semantics. Our abstract interpreter, on the other hand, represents a more significant and novel contribution to the field of static analysis of dynamic scripting languages (PHP in particular). Although our tool is still a prototype and therefore not well suited for handling large real-world codebases, we demonstrate how our semantics-based, principled approach to the development of verification tools has lead to the design of static analyses that outperform existing tools and approaches, both in terms of supported language features, precision, and breadth of possible applications.Open Acces
Abstract Execution: Automatically Proving Infinitely Many Programs
Abstract programs contain schematic placeholders representing potentially infinitely many concrete programs. They naturally occur in multiple areas of computer science concerned with correctness: rule-based compilation and optimization, code refactoring and other source-to-source transformations, program synthesis, Correctness-by-Construction, and more. Mechanized correctness arguments about abstract programs are frequently conducted in interactive environments. While this permits expressing arbitrary properties quantifying over programs, substantial effort has to be invested to prove them manually by writing proof scripts. Existing approaches to proving abstract program properties automatically, on the other hand, lack expressiveness. Frequently, they only support placeholders representing all possible instantiations; in some cases, minor refinements are supported.
This thesis bridges that gap by presenting Abstract Execution (AE), an automatic reasoning technique for universal behavioral properties of abstract programs. The restriction to universal (no existential quantification) and behavioral (not addressing internal structure) properties excludes certain applications; however, it is the key to automation. Our logic for Abstract Execution uses abstract state changes to represent unknown effects on local variables and the heap, and models abrupt completion by symbolic branching. In this logic, schematic placeholders have names: It is possible to re-use them at several places, representing the same program elements in potentially different contexts. Furthermore, the represented concrete programs can be constrained by an expressive specification language, which is a unique feature of AE. We use the theory of dynamic frames to scale between full abstraction and total precision of frame specifications, and support fine-grained pre- and postconditions for (abrupt) completion.
We implemented AE by extending the program verifier KeY. Specifically for relational verification of abstract Java programs, we developed REFINITY, a graphical KeY frontend. We used REFINITY it in our signature application of AE: to model well-known statement-level refactoring techniques and prove their conditional safety. Several yet undocumented behavioral preconditions for safe refactorings originated in this case study, which is one of very few attempts to statically prove behavioral correctness of statement-level refactorings, and the only one to cover them to that extent.
AE extends Symbolic Execution (SE) for abstract programs. As a foundational contribution, we propose a general framework for SE based on the semantics of symbolic states. It natively integrates state merging by supporting m-to-n transitions. We define two orthogonal correctness notions, exhaustiveness and precision, and formally prove their relation to program proving and bug detection.
Finally, we introduce Modal Trace Logic (MTL), a trace-based logic to represent a variety of different program verification tasks, especially for relational verification. It is a “plug-in” logic which can be integrated on-demand with formal languages that have a trace semantics. The core of MTL is the trace modality, which allows expressing that a specification approximates an implementation after a trace abstraction step. We demonstrate the versatility of this approach by formalizing concrete verification tasks in MTL, ranging from functional verification over program synthesis to program evolution. To reason about MTL problems, we translate them to symbolic traces. We suggest Symbolic Trace Logic (STL), which comes with a sequent calculus to prove symbolic trace inclusions. This requires checking symbolic states for subsumption; to that end, we provide two generally useful notions of symbolic state subsumption. This framework relates as follows to the other parts of this thesis: We use the language of abstract programs to express synthesis and compilation, which connects MTL to AE. Moreover, symbolic states of STL are based on our framework for SE
Merging Techniques for Faster Derivation of WCET Flow Information using Abstract Execution
Static Worst-Case Execution Time (WCET) analysis derives upper bounds for the execution times of programs. Such bounds are crucial when designing and verifying real-time systems. A key component in static WCET analysis is to derive flow information, such as loop bounds and infeasible paths. We have previously introduced abstract execution (AE), a method capable of deriving very precise flow information. This paper present different merging techniques that can be used by AE for trading analysis time for flow information precision. It also presents a new technique, ordered merging, which may radically shorten AE analysis times, especially when analyzing large programs with many possible input variable values
A Static Analyzer for Large Safety-Critical Software
We show that abstract interpretation-based static program analysis can be
made efficient and precise enough to formally verify a class of properties for
a family of large programs with few or no false alarms. This is achieved by
refinement of a general purpose static analyzer and later adaptation to
particular programs of the family by the end-user through parametrization. This
is applied to the proof of soundness of data manipulation operations at the
machine level for periodic synchronous safety critical embedded software. The
main novelties are the design principle of static analyzers by refinement and
adaptation through parametrization, the symbolic manipulation of expressions to
improve the precision of abstract transfer functions, the octagon, ellipsoid,
and decision tree abstract domains, all with sound handling of rounding errors
in floating point computations, widening strategies (with thresholds, delayed)
and the automatic determination of the parameters (parametrized packing)
- …