Scalable Logic Defined Static Analysis

Abstract

Logic languages such as Datalog have been proposed as a method for specifying flexible and customisable static analysers. Using Datalog, various classes of static analyses can be expressed precisely and succinctly, requiring fewer lines of code than hand-crafted analysers. In this paradigm, a static analysis specification is encoded by a set of declarative logic rules and an o -the-shelf solver is used to compute the result of the static analysis. Unfortunately, when large-scale analyses are employed, Datalog-based tools currently fail to scale in comparison to hand-crafted static analysers. As a result, Datalog-based analysers have largely remained an academic curiosity, rather than industrially respectful tools. This thesis outlines our e orts in understanding the sources of performance limitations in Datalog-based tools. We propose a novel evaluation technique that is predicated on the fact that in the case of static analysis, the logical specification is a design time artefact and hence does not change during evaluation. Thus, instead of directly evaluating Datalog rules, our approach leverages partial evaluation to synthesise a specialised static analyser from these rules. This approach enables a novel indexing optimisations that automatically selects an optimal set of indexes to speedup and minimise memory usage in the Datalog computation. Lastly, we explore the case of more expressive logics, namely, constrained Horn clause and their use in proving the correctness of programs. We identify a bottleneck in various symbolic evaluation algorithms that centre around Craig interpolation. We propose a method of improving these evaluation algorithms by a proposing a method of guiding theorem provers to discover relevant interpolants with respect to the input logic specification. The culmination of our work is implemented in a general-purpose and highperformance tool called Souffl´e. We describe Souffl´e and evaluate its performance experimentally, showing significant improvement over alternative techniques and its scalability in real-world industrial use cases

    Similar works