139 research outputs found

    Development of a Translator from LLVM to ACL2

    Full text link
    In our current work a library of formally verified software components is to be created, and assembled, using the Low-Level Virtual Machine (LLVM) intermediate form, into subsystems whose top-level assurance relies on the assurance of the individual components. We have thus undertaken a project to build a translator from LLVM to the applicative subset of Common Lisp accepted by the ACL2 theorem prover. Our translator produces executable ACL2 formal models, allowing us to both prove theorems about the translated models as well as validate those models by testing. The resulting models can be translated and certified without user intervention, even for code with loops, thanks to the use of the def::ung macro which allows us to defer the question of termination. Initial measurements of concrete execution for translated LLVM functions indicate that performance is nearly 2.4 million LLVM instructions per second on a typical laptop computer. In this paper we overview the translation process and illustrate the translator's capabilities by way of a concrete example, including both a functional correctness theorem as well as a validation test for that example.Comment: In Proceedings ACL2 2014, arXiv:1406.123

    Formalizing the SSA-based Compiler for Verified Advanced Program Transformations

    Get PDF
    Compilers are not always correct due to the complexity of language semantics and transformation algorithms, the trade-offs between compilation speed and verifiability,etc.The bugs of compilers can undermine the source-level verification efforts (such as type systems, static analysis, and formal proofs) and produce target programs with different meaning from source programs. Researchers have used mechanized proof tools to implement verified compilers that are guaranteed to preserve program semantics and proved to be more robust than ad-hoc non-verified compilers. The goal of the dissertation is to make a step towards verifying an industrial strength modern compiler--LLVM, which has a typed, SSA-based, and general-purpose intermediate representation, therefore allowing more advanced program transformations than existing approaches. The dissertation formally defines the sequential semantics of the LLVM intermediate representation with its type system, SSA properties, memory model, and operational semantics. To design and reason about program transformations in the LLVM IR, we provide tools for interacting with the LLVM infrastructure and metatheory for SSA properties, memory safety, dynamic semantics, and control-flow-graphs. Based on the tools and metatheory, the dissertation implements verified and extractable applications for LLVM that include an interpreter for the LLVM IR, a transformation for enforcing memory safety, translation validators for local optimizations, and verified SSA construction transformation. This dissertation shows that formal models of SSA-based compiler intermediate representations can be used to verify low-level program transformations, thereby enabling the construction of high-assurance compiler passes

    A Formal Semantics of the GraalVM Intermediate Representation

    Full text link
    The optimization phase of a compiler is responsible for transforming an intermediate representation (IR) of a program into a more efficient form. Modern optimizers, such as that used in the GraalVM compiler, use an IR consisting of a sophisticated graph data structure that combines data flow and control flow into the one structure. As part of a wider project on the verification of optimization passes of GraalVM, this paper describes a semantics for its IR within Isabelle/HOL. The semantics consists of a big-step operational semantics for data nodes (which are represented in a graph-based static single assignment (SSA) form) and a small-step operational semantics for handling control flow including heap-based reads and writes, exceptions, and method calls. We have proved a suite of canonicalization optimizations and conditional elimination optimizations with respect to the semantics.Comment: 16 pages, 8 figures, to be published to ATVA 202

    Generating Verified LLVM from Isabelle/HOL

    Get PDF
    We present a framework to generate verified LLVM programs from Isabelle/HOL. It is based on a code generator that generates LLVM text from a simplified fragment of LLVM, shallowly embedded into Isabelle/HOL. On top, we have developed a separation logic, a verification condition generator, and an LLVM backend to the Isabelle Refinement Framework. As case studies, we have produced verified LLVM implementations of binary search and the Knuth-Morris-Pratt string search algorithm. These are one order of magnitude faster than the Standard-ML implementations produced with the original Refinement Framework, and on par with unverified C implementations. Adoption of the original correctness proofs to the new LLVM backend was straightforward. The trusted code base of our approach is the shallow embedding of the LLVM fragment and the code generator, which is a pretty printer combined with some straightforward compilation steps

    Static analysis of energy consumption for LLVM IR programs

    Get PDF
    Energy models can be constructed by characterizing the energy consumed by executing each instruction in a processor's instruction set. This can be used to determine how much energy is required to execute a sequence of assembly instructions, without the need to instrument or measure hardware. However, statically analyzing low-level program structures is hard, and the gap between the high-level program structure and the low-level energy models needs to be bridged. We have developed techniques for performing a static analysis on the intermediate compiler representations of a program. Specifically, we target LLVM IR, a representation used by modern compilers, including Clang. Using these techniques we can automatically infer an estimate of the energy consumed when running a function under different platforms, using different compilers. One of the challenges in doing so is that of determining an energy cost of executing LLVM IR program segments, for which we have developed two different approaches. When this information is used in conjunction with our analysis, we are able to infer energy formulae that characterize the energy consumption for a particular program. This approach can be applied to any languages targeting the LLVM toolchain, including C and XC or architectures such as ARM Cortex-M or XMOS xCORE, with a focus towards embedded platforms. Our techniques are validated on these platforms by comparing the static analysis results to the physical measurements taken from the hardware. Static energy consumption estimation enables energy-aware software development, without requiring hardware knowledge

    Lasagne : a static binary translator for weak memory model architectures

    Get PDF
    Funding: This work was supported by a UK RISE Grant.The emergence of new architectures create a recurring challenge to ensure that existing programs still work on them. Manually porting legacy code is often impractical. Static binary translation (SBT) is a process where a program’s binary is automatically translated from one architecture to another, while preserving their original semantics. However, these SBT tools have limited support to various advanced architectural features. Importantly, they are currently unable to translate concurrent binaries. The main challenge arises from the mismatches of the memory consistency model specified by the different architectures, especially when porting existing binaries to a weak memory model architecture. In this paper, we propose Lasagne, an end-to-end static binary translator with precise translation rules between x86 and Arm concurrency semantics. First, we propose a concurrency model for Lasagne’s intermediate representation (IR) and formally proved mappings between the IR and the two architectures. The memory ordering is preserved by introducing fences in the translated code. Finally, we propose optimizations focused on raising the level of abstraction of memory address calculations and reducing the number offences. Our evaluation shows that Lasagne reduces the number of fences by up to about 65%, with an average reduction of 45.5%, significantly reducing their runtime overhead.Postprin

    A Linear First-Order Functional Intermediate Language for Verified Compilers

    Full text link
    We present the linear first-order intermediate language IL for verified compilers. IL is a functional language with calls to a nondeterministic environment. We give IL terms a second, imperative semantic interpretation and obtain a register transfer language. For the imperative interpretation we establish a notion of live variables. Based on live variables, we formulate a decidable property called coherence ensuring that the functional and the imperative interpretation of a term coincide. We formulate a register assignment algorithm for IL and prove its correctness. The algorithm translates a functional IL program into an equivalent imperative IL program. Correctness follows from the fact that the algorithm reaches a coherent program after consistently renaming local variables. We prove that the maximal number of live variables in the initial program bounds the number of different variables in the final coherent program. The entire development is formalized in Coq.Comment: Addressed comments from reviewers (ITP 2015): (1) Added discussion of a paper in related work (2) Added definition of renamed-apart in appendix (3) Formulation changes in a coupe of place
    • …
    corecore