
    Translation validation for compilation verification

    Modern optimizing compilers such as LLVM and GCC are huge and complex, and mature releases routinely have uncaught bugs. Beyond harm to software development, the lack of formal correctness guarantees for the compilation process seriously limits the guarantees other software systems can provide, since the compiler that generates the final executable cannot be trusted. These circumstances have motivated broad interest in compilation verification: providing a formal guarantee that a compilation of a program is correct. Translation Validation is a commonly used compilation verification technique that aims to prove the correctness of a single instance of compilation, by considering only the specific input and output programs and treating the compiler mostly as a black box. Translation Validation techniques are well-suited to the compilation verification problem because they can be composed to validate a sequence of compilation steps, they can easily be retrofitted to existing compilers, and they can be maintained independently from the compiler itself by a separate team of formal methods experts. The basic components of a Translation Validation system are (1) a formal notion of program equivalence, (2) a verification condition generator that generates a relation between program points and variables in the input and output programs, and (3) a proof system that accepts the verification conditions, generates a machine-checkable equivalence proof, and checks the proof for correctness. Ideally, such a system is completely agnostic to the specifics of the transformation from the input to the output as well as independent of the input/output languages. This allows the same system to be reused across the many transformation and translation passes found in modern compilers.
However, this is not true in the state of the art: most existing systems are custom-tailored for a particular sequence of transformations, and moreover, specialized for a specific, common intermediate language for the input and output programs. The overall goal of this work is to show that it is possible to develop a (mostly) language-independent, transformation-agnostic translation validation system with support for different input/output languages for an optimizing, production-quality compiler. In this thesis, we present such a system as well as the theoretical and practical advances needed to arrive at it. First, we present a formal framework for program equivalence checking that is transformation-agnostic and language-independent. This framework can serve as-is as the proof system for any number of Translation Validation systems targeting different transformation and/or translation phases within an existing compiler. The basis of the framework is a rigorous formalization, namely cut-bisimulation, for weak bisimulation variants that serve as a generalization of the various (sometimes ad-hoc) notions of program equivalence found in the literature. We develop a program equivalence checking algorithm that proves two programs equivalent by reducing a proposed relation between corresponding program states to a cut-bisimulation relation. We implement this algorithm in KEQ, a new tool for checking program equivalence that accepts the operational semantics of the input and output languages as parameters, and is independent of the transformation used to generate the output. This is the first program equivalence checking tool known to the authors that is language-parametric instead of containing hard-coded language semantics as is the norm in the literature. Then, we use KEQ as the equivalence checker for two different Translation Validation systems targeting two phases of the LLVM compiler: the Instruction Selection phase and the Register Allocation phase.
The two systems share the same notion of equivalence (cut-bisimulation), the same proof system (KEQ), as well as the semantic definitions for the input/output languages (LLVM IR and x86-64 based Machine IR), which are separate artifacts and not hardcoded into the logic of the systems. The only components that are transformation-specific are the two verification condition generators. The Instruction Selection one requires minimal support from the compiler in the form of compiler-generated hints, while the Register Allocation one employs a novel inference algorithm for register allocation and related optimizations. These systems were evaluated on the GCC SPEC 2006 benchmark, where they correctly validated 4331 / 4732 (91.52%) and 4574 / 4732 (96.67%) functions with supported features, respectively.
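
The idea of checking that a proposed relation between program states is a bisimulation can be illustrated with a much simpler setting than the one in the abstract. The sketch below checks plain strong bisimulation on small, unlabelled transition systems; it is not cut-bisimulation and not the KEQ algorithm. The names `is_bisimulation`, `left`, `right`, and `candidate` are hypothetical and chosen only for illustration.

```python
from typing import Dict, Set, Tuple

State = str
# A transition system maps each state to its set of successor states.
Transitions = Dict[State, Set[State]]
Relation = Set[Tuple[State, State]]


def is_bisimulation(rel: Relation, left: Transitions, right: Transitions) -> bool:
    """Return True if `rel` is a strong bisimulation between `left` and `right`."""
    for p, q in rel:
        p_succ = left.get(p, set())
        q_succ = right.get(q, set())
        # Every move of the left program must be matched on the right,
        # landing in a pair that is again related.
        if not all(any((p2, q2) in rel for q2 in q_succ) for p2 in p_succ):
            return False
        # Symmetrically, every move of the right program must be matched.
        if not all(any((p2, q2) in rel for p2 in p_succ) for q2 in q_succ):
            return False
    return True


# Hypothetical "input" and "output" programs: two-state loops.
left: Transitions = {"a0": {"a1"}, "a1": {"a0"}}
right: Transitions = {"b0": {"b1"}, "b1": {"b0"}}
# A "miscompiled" output that gets stuck in b1.
broken_right: Transitions = {"b0": {"b1"}, "b1": set()}

candidate: Relation = {("a0", "b0"), ("a1", "b1")}
```

With these definitions, `is_bisimulation(candidate, left, right)` accepts the relation, while the same relation is rejected against `broken_right` because the move out of `a1` has no matching move out of `b1`. In a translation validation setting the candidate relation would come from a verification condition generator, which is the role the abstract assigns to the transformation-specific components.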

    Trustworthy Refactoring via Decomposition and Schemes: A Complex Case Study

    Widely used complex code refactoring tools lack solid reasoning about the correctness of the transformations they implement, whilst interest in proven-correct refactoring is ever increasing, as only formal verification can provide true confidence in applying tool-automated refactoring to industrial-scale code. Using our refactoring specification language, which is based on strategic rewriting, we present the decomposition of a complex transformation into smaller steps that can be expressed as instances of refactoring schemes. We then demonstrate the semi-automatic formal verification of the components based on a theoretical understanding of the semantics of the programming language. The extensible and verifiable refactoring definitions can be executed in our interpreter built on top of a static analyser framework. Comment: In Proceedings VPT 2017, arXiv:1708.0688

    Computational reverse mathematics and foundational analysis

    Reverse mathematics studies which subsystems of second order arithmetic are equivalent to key theorems of ordinary, non-set-theoretic mathematics. The main philosophical application of reverse mathematics proposed thus far is foundational analysis, which explores the limits of different foundations for mathematics in a formally precise manner. This paper gives a detailed account of the motivations and methodology of foundational analysis, which have heretofore been largely left implicit in the practice. It then shows how this account can be fruitfully applied in the evaluation of major foundational approaches by a careful examination of two case studies: a partial realization of Hilbert's program due to Simpson [1988], and predicativism in the extended form due to Feferman and Schütte. Shore [2010, 2013] proposes that equivalences in reverse mathematics be proved in the same way as inequivalences, namely by considering only $\omega$-models of the systems in question. Shore refers to this approach as computational reverse mathematics. This paper shows that despite some attractive features, computational reverse mathematics is inappropriate for foundational analysis, for two major reasons. Firstly, the computable entailment relation employed in computational reverse mathematics does not preserve justification for the foundational programs above. Secondly, computable entailment is a $\Pi^1_1$-complete relation, and hence employing it commits one to theoretical resources which outstrip those available within any foundational approach that is proof-theoretically weaker than $\Pi^1_1\text{-}\mathsf{CA}_0$. Comment: Submitted. 41 pages

    Scheduler-specific Confidentiality for Multi-Threaded Programs and Its Logic-Based Verification

    Observational determinism has been proposed in the literature as a way to ensure confidentiality for multi-threaded programs. Intuitively, a program is observationally deterministic if the behavior of the public variables is deterministic, i.e., independent of the private variables and the scheduling policy. Several formal definitions of observational determinism exist, but all of them have shortcomings; for example, they accept insecure programs or they reject too many innocuous programs. Besides, the role of schedulers was ignored in all the proposed definitions. A program that is secure under one kind of scheduler might not be secure when executed with a different scheduler. The existing definitions do not ensure that an accepted program behaves securely under the scheduler that is used to deploy the program. Therefore, this paper proposes a new formalization of scheduler-specific observational determinism. It accepts programs that are secure when executed under a specific scheduler. Moreover, it is less restrictive on harmless programs under a particular scheduling policy. In addition, we discuss how compliance with our definition can be verified using model checking. We use the idea of self-composition and we rephrase the observational determinism property for a single program $C$ as a temporal logic formula over the program $C$ executed in parallel with an independent copy of itself. Thus two states reachable during the execution of $C$ are combined into a reachable program state of the self-composed program. This makes it possible to compare two program executions in a single temporal logic formula. The actual characterization is done in two steps. First we discuss how stuttering equivalence can be characterized as a temporal logic formula. Observational determinism is then expressed in terms of the stuttering equivalence characterization. This results in a conjunction of an LTL formula and a CTL formula, both of which are amenable to model checking.
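
The claim that a program can be secure under one scheduler and insecure under another can be made concrete with a toy example. The sketch below is an assumption-laden illustration, not the paper's method: it compares two runs directly by brute force instead of building a self-composed program and model checking a temporal logic formula. The two-thread program, the schedulers `round_robin` and `sequential`, and all other names are hypothetical.

```python
from itertools import groupby
from typing import Callable, List, Optional

# A step is either None (no effect on the public variable `low`)
# or an int (an assignment of that value to `low`).
Step = Optional[int]
Scheduler = Callable[[List[List[Step]]], List[int]]


def threads(secret: bool) -> List[List[Step]]:
    """Hypothetical program: thread A delays its write when the secret is set."""
    thread_a: List[Step] = [None, None, 1] if secret else [1]
    thread_b: List[Step] = [None, 2]
    return [thread_a, thread_b]


def round_robin(ts: List[List[Step]]) -> List[int]:
    """Alternate between threads, skipping those that have finished."""
    remaining = [len(t) for t in ts]
    order, i = [], 0
    while any(remaining):
        if remaining[i]:
            order.append(i)
            remaining[i] -= 1
        i = (i + 1) % len(ts)
    return order


def sequential(ts: List[List[Step]]) -> List[int]:
    """Run each thread to completion before starting the next one."""
    return [i for i, t in enumerate(ts) for _ in t]


def public_trace(secret: bool, scheduler: Scheduler) -> List[int]:
    """Trace of `low`, with consecutive repeats collapsed (stuttering removed)."""
    ts = threads(secret)
    pcs = [0] * len(ts)
    low = 0
    trace = [low]
    for i in scheduler(ts):
        step = ts[i][pcs[i]]
        pcs[i] += 1
        if step is not None:
            low = step
        trace.append(low)
    return [value for value, _ in groupby(trace)]


def observationally_deterministic(scheduler: Scheduler) -> bool:
    """Compare the public traces of two runs that differ only in the secret."""
    return public_trace(False, scheduler) == public_trace(True, scheduler)
```

Under `sequential` both runs produce the same stutter-free public trace, so the toy program passes the check; under `round_robin` the order of the two writes to `low` depends on the secret, so it fails. Collapsing repeated values with `groupby` plays the role of stuttering equivalence in this sketch.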

    Advanced Probabilistic Couplings for Differential Privacy

    Differential privacy is a promising formal approach to data privacy, which provides a quantitative bound on the privacy cost of an algorithm that operates on sensitive information. Several tools have been developed for the formal verification of differentially private algorithms, including program logics and type systems. However, these tools do not capture fundamental techniques that have emerged in recent years, and cannot be used for reasoning about cutting-edge differentially private algorithms. Existing techniques fail to handle three broad classes of algorithms: 1) algorithms where privacy depends on accuracy guarantees, 2) algorithms that are analyzed with the advanced composition theorem, which shows slower growth in the privacy cost, 3) algorithms that interactively accept adaptive inputs. We address these limitations with a new formalism extending apRHL, a relational program logic that has been used for proving differential privacy of non-interactive algorithms, and incorporating aHL, a (non-relational) program logic for accuracy properties. We illustrate our approach through a single running example, which exemplifies the three classes of algorithms and explores new variants of the Sparse Vector technique, a well-studied algorithm from the privacy literature. We implement our logic in EasyCrypt, and formally verify privacy. We also introduce a novel coupling technique called optimal subset coupling that may be of independent interest.
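
For readers unfamiliar with the "quantitative bound" and the "slower growth" mentioned above, the standard textbook statements are reproduced here as background. They are not taken from the abstract, and the symbols $M$, $D$, $D'$, $S$, $k$, and $\delta'$ are notation introduced for this note.

```latex
% (\varepsilon, \delta)-differential privacy: for all adjacent inputs D, D'
% and all sets of outputs S,
\Pr[M(D) \in S] \;\le\; e^{\varepsilon} \, \Pr[M(D') \in S] + \delta .

% Sequential (basic) composition of k such mechanisms gives
% (k\varepsilon, k\delta)-differential privacy.

% Advanced composition: for any \delta' > 0 the k-fold composition is
% (\varepsilon', k\delta + \delta')-differentially private, where
\varepsilon' \;=\; \varepsilon \sqrt{2k \ln(1/\delta')} \;+\; k \varepsilon \,(e^{\varepsilon} - 1) .
```

For small $\varepsilon$ the leading term of $\varepsilon'$ grows like $\sqrt{k}$ rather than $k$, which is the slower growth in privacy cost that the second class of algorithms relies on.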

    Deciding KAT and Hoare Logic with Derivatives

    Kleene algebra with tests (KAT) is an equational system for program verification that combines Boolean algebra (BA) and Kleene algebra (KA), the algebra of regular expressions. In particular, KAT subsumes the propositional fragment of Hoare logic (PHL), a formal system for the specification and verification of programs that currently underlies most tools for checking program correctness. Both the equational theory of KAT and the encoding of PHL in KAT are known to be decidable. In this paper we present a new decision procedure for the equivalence of two KAT expressions based on the notion of partial derivatives. We also introduce the notion of derivative modulo particular sets of equations. With this we extend the previous procedure for deciding PHL. Some experimental results are also presented. Comment: In Proceedings GandALF 2012, arXiv:1210.202
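
As background on how KAT subsumes propositional Hoare logic, the usual encoding of program constructs and partial correctness assertions as KAT terms is sketched below. This is the standard encoding from the KAT literature rather than anything specific to the derivative-based procedure of the abstract; $b$ and $c$ denote tests, $p$ and $q$ denote programs, and $\bar{b}$ is the Boolean complement of $b$.

```latex
% Program constructs as KAT terms
\mathbf{if}\ b\ \mathbf{then}\ p\ \mathbf{else}\ q \;=\; b\,p + \bar{b}\,q
\qquad
\mathbf{while}\ b\ \mathbf{do}\ p \;=\; (b\,p)^{*}\,\bar{b}

% A Hoare partial correctness assertion as an equation
\{b\}\; p\; \{c\}
\quad\Longleftrightarrow\quad
b\,p\,\bar{c} = 0
\quad\Longleftrightarrow\quad
b\,p = b\,p\,c
```

Under this encoding, a Hoare-logic inference rule becomes an implication between equations, so a procedure that decides equalities of KAT expressions is the basic building block for deciding PHL.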