75 research outputs found

    A New Verified Compiler Backend for CakeML

    Get PDF
    We have developed and mechanically verified a new compiler backend for CakeML. Our new compiler features a sequence of intermediate languages that allows it to incrementally compile away high-level features and enables verification at the right levels of semantic detail. In this way, it resembles mainstream (unverified) compilers for strict functional languages. The compiler supports efficient curried multi-argument functions, configurable data representations, exceptions that unwind the call stack, register allocation, and more. The compiler targets several architectures: x86-64, ARMv6, ARMv8, MIPS-64, and RISC-V. In this paper, we present the overall structure of the compiler, including its 12 intermediate languages, and explain how everything fits together. We focus particularly on the interaction between the verification of the register allocator and the garbage collector, and memory representations. The entire development has been carried out within the HOL4 theorem prover.Engineering and Physical Sciences Research Counci

    Cakes That Bake Cakes: Dynamic Computation in CakeML

    Get PDF
    We have extended the verified CakeML compiler with a new language primitive, Eval, which permits evaluation of new CakeML syntax at runtime. This new implementation supports an ambitious form of compilation at runtime and dynamic execution, where the original and dynamically added code can share (higher-order) values and recursively call each other. This is, to our knowledge, the first verified run-Time environment capable of supporting a standard LCF-style theorem prover design. Modifying the modern CakeML compiler pipeline and proofs to support a dynamic computation semantics was an extensive project. We review the design decisions, proof techniques, and proof engineering lessons from the project, and highlight some unexpected complications

    A Verified Certificate Checker for Finite-Precision Error Bounds in Coq and HOL4

    Full text link
    Being able to soundly estimate roundoff errors of finite-precision computations is important for many applications in embedded systems and scientific computing. Due to the discrepancy between continuous reals and discrete finite-precision values, automated static analysis tools are highly valuable to estimate roundoff errors. The results, however, are only as correct as the implementations of the static analysis tools. This paper presents a formally verified and modular tool which fully automatically checks the correctness of finite-precision roundoff error bounds encoded in a certificate. We present implementations of certificate generation and checking for both Coq and HOL4 and evaluate it on a number of examples from the literature. The experiments use both in-logic evaluation of Coq and HOL4, and execution of extracted code outside of the logics: we benchmark Coq extracted unverified OCaml code and a CakeML-generated verified binary

    Verified compilation and optimization of floating-point kernels

    Get PDF
    When verifying safety-critical code on the level of source code, we trust the compiler to produce machine code that preserves the behavior of the source code. Trusting a verified compiler is easy. A rigorous machine-checked proof shows that the compiler correctly translates source code into machine code. Modern verified compilers (e.g. CompCert and CakeML) have rich input languages, but only rudimentary support for floating-point arithmetic. In fact, state-of-the-art verified compilers only implement and verify an inflexible one-to-one translation from floating-point source code to machine code. This translation completely ignores that floating-point arithmetic is actually a discrete representation of the continuous real numbers. This thesis presents two extensions improving floating-point arithmetic in CakeML. First, the thesis demonstrates verified compilation of elementary functions to floating-point code in: Dandelion, an automatic verifier for polynomial approximations of elementary functions; and libmGen, a proof-producing compiler relating floating-point machine code to the implemented real-numbered elementary function. Second, the thesis demonstrates verified optimization of floating-point code in: Icing, a floating-point language extending standard floating-point arithmetic with optimizations similar to those used by unverified compilers, like GCC and LLVM; and RealCake, an extension of CakeML with Icing into the first fully verified optimizing compiler for floating-point arithmetic.Bei der Verifizierung von sicherheitsrelevantem Quellcode vertrauen wir dem Compiler, dass er Maschinencode ausgibt, der sich wie der Quellcode verhält. Man kann ohne weiteres einem verifizierten Compiler vertrauen. Ein rigoroser maschinen-ü}berprüfter Beweis zeigt, dass der Compiler Quellcode in korrekten Maschinencode übersetzt. Moderne verifizierte Compiler (z.B. CompCert und CakeML) haben komplizierte Eingabesprachen, aber unterstützen Gleitkommaarithmetik nur rudimentär. De facto implementieren und verifizieren hochmoderne verifizierte Compiler für Gleitkommaarithmetik nur eine starre eins-zu-eins Übersetzung von Quell- zu Maschinencode. Diese Übersetzung ignoriert vollständig, dass Gleitkommaarithmetik eigentlich eine diskrete Repräsentation der kontinuierlichen reellen Zahlen ist. Diese Dissertation präsentiert zwei Erweiterungen die Gleitkommaarithmetik in CakeML verbessern. Zuerst demonstriert die Dissertation verifizierte Übersetzung von elementaren Funktionen in Gleitkommacode mit: Dandelion, einem automatischen Verifizierer für Polynomapproximierungen von elementaren Funktionen; und libmGen, einen Beweis-erzeugenden Compiler der Gleitkommacode in Relation mit der implementierten elementaren Funktion setzt. Dann demonstriert die Dissertation verifizierte Optimierung von Gleitkommacode mit: Icing, einer Gleitkommasprache die Gleitkommaarithmetik mit Optimierungen erweitert die ähnlich zu denen in unverifizierten Compilern, wie GCC und LLVM, sind; und RealCake, eine Erweiterung von CakeML mit Icing als der erste vollverifizierte Compiler für Gleitkommaarithmetik

    The Verified CakeML Compiler Backend

    Get PDF
    The CakeML compiler is, to the best of our knowledge, the most realistic veri?ed compiler for a functional programming language to date. The architecture of the compiler, a sequence of intermediate languages through which high-level features are compiled away incrementally, enables veri?cation of each compilation pass at inappropriate level of semantic detail.Partsofthecompiler’s implementation resemble mainstream (unveri?ed) compilers for strict functional languages, and it support several important features and optimisations. These include ef?cient curried multi-argument functions, con?gurable data representations, ef?cient exceptions, register allocation,and more. The compiler produces machine code for ?ve architectures: x86-64, ARMv6, ARMv8, MIPS-64, and RISC-V. The generatedmachine code contains the veri?edruntime system which includes averi?ed generational copying garbage collect or and averi?edarbitraryprecisionarithmetic(bignum)library. In this paper we present the overall design of the compiler backend, including its 12 intermediate languages. We explain how the semantics and proofs ?t together, and provide detail on how the compiler has been bootstrapped inside the logic of a theorem prover. The entire development has been carried out within the HOL4 theorem prover

    Characteristic Formulae for Liveness Properties of Non-Terminating CakeML Programs

    Get PDF
    There are useful programs that do not terminate, and yet standard Hoare logics are not able to prove liveness properties about non-terminating programs. This paper shows how a Hoare-like programming logic framework (characteristic formulae) can be extended to enable reasoning about the I/O behaviour of programs that do not terminate. The approach is inspired by transfinite induction rather than coinduction, and does not require non-terminating loops to be productive. This work has been developed in the HOL4 theorem prover and has been integrated into the ecosystem of proof tools surrounding the CakeML programming language

    Choreographies and Cost Semantics for Reliable Communicating Systems

    Get PDF
    Communicating systems have become ubiquitous in today\u27s society.Unfortunately, the complexity of their interactions makes themparticularly prone to failures such as deadlocked states causedby misbehaving components, or memory exhaustion due to a surge inmessage traffic (malicious or not). These vulnerabilitiesconstitute a real risk to users, with consequences ranging fromminor inconveniences to the possibility of loss of life andcapital. This thesis presents two results that aim to increasethe reliability of communicating systems. First, we implement achoreography language which by construction can only describesystems that are deadlock-free. Second, we develop a costsemantics to prove programs free of out-of-memory errors. Both ofthese results are formalized in the HOL4 theorem prover andintegrated with the CakeML verified stack

    A verified generational garbage collector for CakeML

    Get PDF
    This paper presents the verification of a generational copying garbage collector for the CakeML runtime system. The proof is split into an algorithm proof and an implementation proof. The algorithm proof follows the structure of the informal intuition for the generational collector’s correctness, namely, a partial collection cycle in a generational collector is the same as running a full collection on part of the heap, if one views pointers to old data as non-pointers. We present a pragmatic way of dealing with ML-style mutable state, such as references and arrays, in the proofs. The development has been fully integrated into the in-logic bootstrapped CakeML compiler, which now includes command-line arguments that allow configuration of the generational collector. All proofs were carried out in the HOL4 theorem prover

    Do you have space for dessert? a verified space cost semantics for CakeML programs

    Get PDF
    Garbage collectors relieve the programmer from manual memory management, but lead to compiler-generated machine code that can behave differently (e.g. out-of-memory errors) from the source code. To ensure that the generated code behaves exactly like the source code, programmers need a way to answer questions of the form: what is a sufficient amount of memory for my program to never reach an out-of-memory error? This paper develops a cost semantics that can answer such questions for CakeML programs. The work described in this paper is the first to be able to answer such questions with proofs in the context of a language that depends on garbage collection. We demonstrate that positive answers can be used to transfer liveness results proved for the source code to liveness guarantees about the generated machine code. Without guarantees about space usage, only safety results can be transferred from source to machine code. Our cost semantics is phrased in terms of an abstract intermediate language of the CakeML compiler, but results proved at that level map directly to the space cost of the compiler-generated machine code. All of the work described in this paper has been developed in the HOL4 theorem prover
    • …
    corecore