Search CORE

4 research outputs found

Differentially Testing Soundness and Precision of Program Analyzers

Author: Amato Gianluca
Besson Frédéric
Beyer Dirk
Blazy Sandrine
Bradley Aaron R.
Clarke Edmund M.
Cousot Patrick
de Moura Leonardo
Dolan-Gavitt Brendan
Donaldson Alastair F.
Dubois Catherine
Gange Graeme
Graf Susanne
Gurfinkel Arie
Heizmann Matthias
Komuravelli Anvesh
Midtgaard Jan
Publication venue
Publication date: 16/12/2018
Field of study

In the last decades, numerous program analyzers have been developed both by academia and industry. Despite their abundance however, there is currently no systematic way of comparing the effectiveness of different analyzers on arbitrary code. In this paper, we present the first automated technique for differentially testing soundness and precision of program analyzers. We used our technique to compare six mature, state-of-the art analyzers on tens of thousands of automatically generated benchmarks. Our technique detected soundness and precision issues in most analyzers, and we evaluated the implications of these issues to both designers and users of program analyzers

arXiv.org e-Print Archive

Crossref

MPG.PuRe

Higher-Order, Data-Parallel Structured Deduction

Author: Gilray Thomas
Kumar Sidharth
Micinski Kristopher
Sahebolamri Arash
Publication venue
Publication date: 21/11/2022
Field of study

State-of-the-art Datalog engines include expressive features such as ADTs (structured heap values), stratified aggregation and negation, various primitive operations, and the opportunity for further extension using FFIs. Current parallelization approaches for state-of-art Datalogs target shared-memory locking data-structures using conventional multi-threading, or use the map-reduce model for distributed computing. Furthermore, current state-of-art approaches cannot scale to formal systems which pervasively manipulate structured data due to their lack of indexing for structured data stored in the heap. In this paper, we describe a new approach to data-parallel structured deduction that involves a key semantic extension of Datalog to permit first-class facts and higher-order relations via defunctionalization, an implementation approach that enables parallelism uniformly both across sets of disjoint facts and over individual facts with nested structure. We detail a core language,

DL_s

, whose key invariant (subfact closure) ensures that each subfact is materialized as a top-class fact. We extend

DL_s

to Slog, a fully-featured language whose forms facilitate leveraging subfact closure to rapidly implement expressive, high-performance formal systems. We demonstrate Slog by building a family of control-flow analyses from abstract machines, systematically, along with several implementations of classical type systems (such as STLC and LF). We performed experiments on EC2, Azure, and ALCF's Theta at up to 1000 threads, showing orders-of-magnitude scalability improvements versus competing state-of-art systems

arXiv.org e-Print Archive

Deconstructing Datalog

Author: Arntzenius Michael
Publication venue
Publication date: 07/12/2022
Field of study

The deductive query language Datalog has found a wide array of uses, including static analy- sis (Smaragdakis and Bravenboer, 2010), business analytics (Aref et al., 2015), and distributed programming (Alvaro et al., 2010, 2011). Datalog is high-level and declarative, but simple and well-studied enough to admit efficient implementation strategies. For example, Whaley et al. found they could replace a hand-tuned C implementation of context-sensitive pointer analysis with a comparably-performing Datalog program that was 100x smaller (Whaley and Lam, 2004; Whaley et al., 2005). However, Datalog’s semantics are not stable under extensions. For instance, adding arithmetic operations breaks Datalog’s termination guarantee. Despite this, nearly all practical implementations extend Datalog beyond its theoretical core to add niceties such as arithmetic, datatypes, aggregations, and so on. Moreover, pure Datalog cannot abstract over repeated code: one may express a static analysis over a particular program, but to express the same analysis over multiple programs, one must duplicate the analysis code for each program analyzed. This thesis deconstructs Datalog from a categorical and type theoretic perspective to determine what makes it tick. Datalog’s semantic guarantees are provided by brute syntactic restrictions, such as stratification and the absence of function symbols. In place of these, we find compositional semantic properties such as monotonicity, which we capture using types. We show that this permits integrating Datalog’s features with those of typed functional languages, such as algebraic data types and higher order functions. In particular, this thesis makes the following contributions: 1. We define and expound the semantics and metatheory of Datafun, a pure and total higher-order typed functional language capturing the essence of Datalog. Where Data- log has predicates defined by a restricted class of Horn clauses, Datafun has finite sets and set comprehensions; Datalog’s bottom-up recursive queries become iterative fixed points; and Datalog’s stratification condition becomes a matter of tracking monotonicity with types. 2. We show how to generalize seminaïve evaluation to handle higher-order functions. Seminaïve evaluation is a technique from the Datalog literature which improves the performance of Datalog’s most distinctive feature: recursive queries. These are com- puted iteratively, and under a naïve evaluation strategy, each iteration recomputes all previous values. Seminaïve evaluation computes a safe approximation of the difference between iterations. This can asymptotically improve the performance of Datalog queries. Seminaïve evaluation is defined partly as a program transformation and partly as a modified iteration strategy, and takes advantage of the first-order nature of Datalog. We extend this transformation to handle higher-order programs written in Datafun. 3. In the process of generalizing seminaïve evaluation, we uncover a theory of incremental, monotone, higher-order computation, in which values change over time by growing larger, and programs respond incrementally to these increases

University of Birmingham Research Archive, E-theses Repository

Safe and sound program analysis with Flix

Author: Lhoták Ondřej
Madsen Magnus
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 12/07/2018
Field of study

Crossref

VBN