
    Compositional Verification of Compiler Optimisations on Relaxed Memory

    This paper is about verifying program transformations on an axiomatic relaxed memory model of the kind used in C/C++ and Java. Relaxed models present particular challenges for verifying program transformations, because they generate many additional modes of interaction between code and context. For a block of code being transformed, we define a denotation from its behaviour in a set of representative contexts. Our denotation summarises interactions of the code block with the rest of the program both through local and global variables, and through subtle synchronisation effects due to relaxed memory. We can then prove that a transformation does not introduce new program behaviours by comparing the denotations of the code block before and after. Our approach is compositional: by examining only representative contexts, transformations are verified for any context. It is also fully abstract, meaning any valid transformation can be verified. We cover several tricky aspects of C/C++-style memory models, including release-acquire operations, sequentially consistent fences, and non-atomics. We also define a variant of our denotation that is finite at the cost of losing full abstraction. Based on this variant, we have implemented a prototype verification tool.
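
    As a concrete illustration (ours, not an excerpt from the paper) of the kind of transformation such a tool checks, consider collapsing two adjacent relaxed loads of the same C++11 atomic into one. The verification question is whether the transformed block can exhibit any behaviour, in any surrounding context, that the original could not.

        #include <atomic>

        std::atomic<int> x{0};

        // Original block: two adjacent relaxed loads of the same atomic.
        int before() {
            int a = x.load(std::memory_order_relaxed);
            int b = x.load(std::memory_order_relaxed);
            return a + b;
        }

        // Transformed block: the second load is eliminated and the first
        // value is reused. Verifying the transformation means showing that
        // 'after' has no behaviour, in any context, that 'before' did not
        // already allow.
        int after() {
            int a = x.load(std::memory_order_relaxed);
            return a + a;
        }

        int main() {
            x.store(1, std::memory_order_relaxed);
            return before() == after() ? 0 : 1;  // both return 2 here
        }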

    The problem of programming language concurrency semantics

    Despite decades of research, we do not have a satisfactory concurrency semantics for any general-purpose programming language that aims to support concurrent systems code. The Java Memory Model has been shown to be unsound with respect to standard compiler optimisations, while the C/C++11 model is too weak, admitting undesirable thin-air executions. Our goal in this paper is to articulate this major open problem as clearly as is currently possible, showing how it arises from the combination of multiprocessor relaxed-memory behaviour and the desire to accommodate current compiler optimisations. We make several novel contributions that each shed some light on the problem, constraining the possible solutions and identifying new difficulties. First we give a positive result, proving in HOL4 that the existing axiomatic model for C/C++11 guarantees sequentially consistent semantics for simple race-free programs that do not use low-level atomics (DRF-SC, one of the core design goals). We then describe the thin-air problem and show that it cannot be solved, without restricting current compiler optimisations, using any per-candidate-execution condition in the style of the C/C++11 model. Thin-air executions were thought to be confined to programs using relaxed atomics, but we further show that they recur when one attempts to integrate the concurrency model with more of C, mixing atomic and nonatomic accesses, and that this also breaks the DRF-SC result. We then describe a semantics based on an explicit operational construction of out-of-order execution, giving the desired behaviour for thin-air examples but exposing further difficulties with accommodating existing compiler optimisations. Finally, we show that there are major difficulties integrating concurrency semantics with the C/C++ notion of undefined behaviour. We hope thereby to stimulate and enable research on this key issue.
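
    The thin-air problem is easiest to see on the standard load-buffering litmus test, sketched here in C++11 (our rendering, not code from the paper): with relaxed atomics and a data dependency in each thread, a per-candidate-execution model in the C/C++11 style cannot forbid the outcome in which a value such as 42 appears from nowhere.

        #include <atomic>
        #include <thread>

        std::atomic<int> x{0}, y{0};
        int r1 = 0, r2 = 0;

        void t1() {
            r1 = x.load(std::memory_order_relaxed);  // read x ...
            y.store(r1, std::memory_order_relaxed);  // ... and forward it to y
        }

        void t2() {
            r2 = y.load(std::memory_order_relaxed);  // read y ...
            x.store(r2, std::memory_order_relaxed);  // ... and forward it to x
        }

        int main() {
            std::thread a(t1), b(t2);
            a.join(); b.join();
            // No compiler or hardware produces r1 == r2 == 42, since 42 is
            // never written by the program, yet the per-candidate-execution
            // C/C++11 axiomatic model does not forbid that outcome.
        }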

    A New Verified Compiler Backend for CakeML

    We have developed and mechanically verified a new compiler backend for CakeML. Our new compiler features a sequence of intermediate languages that allows it to incrementally compile away high-level features and enables verification at the right levels of semantic detail. In this way, it resembles mainstream (unverified) compilers for strict functional languages. The compiler supports efficient curried multi-argument functions, configurable data representations, exceptions that unwind the call stack, register allocation, and more. The compiler targets several architectures: x86-64, ARMv6, ARMv8, MIPS-64, and RISC-V. In this paper, we present the overall structure of the compiler, including its 12 intermediate languages, and explain how everything fits together. We focus particularly on the interaction between the verification of the register allocator and the garbage collector, and memory representations. The entire development has been carried out within the HOL4 theorem prover.
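
    As a schematic illustration of why a sequence of intermediate languages helps (our toy sketch, not CakeML code), the following shows a single pass over a tiny expression language that compiles away one feature, together with the per-pass semantics-preservation statement that a verified compiler proves once per pass and then composes.

        #include <cassert>
        #include <memory>

        // Toy source language: literals, addition, subtraction, and negation.
        struct Expr {
            enum Kind { Lit, Add, Sub, Neg } kind;
            int value;
            std::shared_ptr<Expr> a, b;
        };
        using E = std::shared_ptr<Expr>;

        E lit(int v) { return E(new Expr{Expr::Lit, v, nullptr, nullptr}); }
        E node(Expr::Kind k, E a, E b = nullptr) { return E(new Expr{k, 0, a, b}); }

        // Source semantics.
        int eval(const E& e) {
            switch (e->kind) {
                case Expr::Lit: return e->value;
                case Expr::Add: return eval(e->a) + eval(e->b);
                case Expr::Sub: return eval(e->a) - eval(e->b);
                case Expr::Neg: return -eval(e->a);
            }
            return 0;
        }

        // One pass: the target language has no Neg, so rewrite Neg(x) as Sub(0, x).
        E compile_away_neg(const E& e) {
            if (e->kind == Expr::Lit) return e;
            if (e->kind == Expr::Neg) return node(Expr::Sub, lit(0), compile_away_neg(e->a));
            return node(e->kind, compile_away_neg(e->a), compile_away_neg(e->b));
        }

        int main() {
            E src = node(Expr::Add, lit(1), node(Expr::Neg, lit(5)));
            // Per-pass correctness, checked here on one input; a verified
            // compiler proves it for all inputs and composes the per-pass
            // theorems into an end-to-end theorem.
            assert(eval(src) == eval(compile_away_neg(src)));
        }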

    Mathematizing C++ concurrency

    Shared-memory concurrency in C and C++ is pervasive in systems programming, but has long been poorly defined. This motivated an ongoing shared effort by the standards committees to specify concurrent behaviour in the next versions of both languages. They aim to provide strong guarantees for race-free programs, together with new (but subtle) relaxed-memory atomic primitives for high-performance concurrent code. However, the current draft standards, while the result of careful deliberation, are not yet clear and rigorous definitions, and harbour substantial problems in their details. In this paper we establish a mathematical (yet readable) semantics for C++ concurrency. We aim to capture the intent of the current ('Final Committee') Draft as closely as possible, but discuss changes that fix many of its problems. We prove that a proposed x86 implementation of the concurrency primitives is correct with respect to the x86-TSO model, and describe our Cppmem tool for exploring the semantics of examples, using code generated from our Isabelle/HOL definitions. Having already motivated changes to the draft standard, this work will aid discussion of any further changes, provide a correctness condition for compilers, and give a much-needed basis for analysis and verification of concurrent C and C++ programs.
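
    The guarantee the formal model is meant to pin down is visible in the standard message-passing idiom, sketched here in C++11 (our illustration, not an excerpt from the paper): a release store paired with an acquire load creates a happens-before edge, so the non-atomic payload is race-free and its value is guaranteed to be observed.

        #include <atomic>
        #include <thread>

        int data = 0;                      // non-atomic payload
        std::atomic<bool> ready{false};

        void producer() {
            data = 42;                                     // plain write
            ready.store(true, std::memory_order_release);  // publish
        }

        void consumer() {
            while (!ready.load(std::memory_order_acquire)) {}  // wait
            // The acquire load that reads the release store establishes
            // happens-before, so this read is race-free and must see 42.
            int r = data;
            (void)r;
        }

        int main() {
            std::thread t1(producer), t2(consumer);
            t1.join(); t2.join();
        }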

    On static execution-time analysis

    Proving timeliness is an integral part of the verification of safety-critical real-time systems. To this end, timing analysis computes upper bounds on the execution times of programs that execute on a given hardware platform. Modern hardware platforms commonly exhibit counter-intuitive timing behaviour: a locally slower execution can lead to a faster overall execution. Such behaviour challenges efficient timing analysis. In this work, we present and discuss a hardware design, the strictly in-order pipeline, that behaves monotonically w.r.t. the progress of a program's execution. Based on monotonicity, we prove the absence of the aforementioned counter-intuitive behaviour. At least since multi-core processors have emerged, timing analysis separates concerns by analysing different aspects of the system's timing behaviour individually. In this work, we validate the underlying assumption that a timing bound can be soundly composed from individual contributions. We show that even simple processors exhibit counter-intuitive behaviour - a locally slow execution can lead to an even slower overall execution - that impedes the soundness of the composition. We present the compositional base bound analysis that accounts for any such amplifying effects within its timing contribution. This enables a sound compositional analysis even for complex processors. Furthermore, we discuss hardware modifications that enable efficient compositional analyses.
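
    To see why local and global worst cases do not simply compose, the following schematic example (the numbers are our own illustration, not figures from the thesis) shows how a locally slower cache miss can yield a globally faster execution.

        #include <iostream>

        int main() {
            // Scenario A: instruction I1 hits the cache (1 cycle) but then
            // competes with I2 for a functional unit, stalling I2 for 15
            // cycles before I2's own 2 cycles of work.
            int total_hit  = 1 + 15 + 2;   // 18 cycles
            // Scenario B: I1 misses the cache (10 cycles); I2 executes during
            // the miss, so the structural stall never happens.
            int total_miss = 10 + 2;       // 12 cycles
            std::cout << "cache hit path:  " << total_hit  << " cycles\n"
                      << "cache miss path: " << total_miss << " cycles\n";
            // The locally slower event (the miss) produces the faster overall
            // run, so a sound analysis cannot assume local worst cases compose
            // into the global worst case.
        }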

    Proceedings of SUMo and CompoNet 2011


    Certifying cost annotations in compilers

    We discuss the problem of building a compiler which can lift in a provably correct way pieces of information on the execution cost of the object code to cost annotations on the source code. To this end, we need a clear and flexible picture of: (i) the meaning of cost annotations, (ii) the method to prove them sound and precise, and (iii) the way such proofs can be composed. We propose a so-called labelling approach to these three questions. As a first step, we examine its application to a toy compiler. This formal study suggests that the labelling approach has good compositionality and scalability properties. In order to provide further evidence for this claim, we report our successful experience in implementing and testing the labelling approach on top of a prototype compiler written in OCaml for (a large fragment of) the C language.
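
    The labelling idea can be sketched as follows (our illustration, not the authors' compiler): each basic block of the source carries a label, the cost of the object code emitted for that label is measured, and that cost is lifted back to the source as an annotation, modelled here as increments of a counter whose per-label values are placeholders.

        #include <iostream>

        // In the real approach, per-label costs are derived from the object
        // code the compiler emits for each labelled block; the constants
        // below are placeholders.
        long cost = 0;  // accumulated cost annotation

        int sum(const int* a, int n) {
            cost += 3;                      // label L_entry: prologue cost
            int s = 0;
            for (int i = 0; i < n; ++i) {
                cost += 5;                  // label L_body: per-iteration cost
                s += a[i];
            }
            cost += 2;                      // label L_exit: epilogue cost
            return s;
        }

        int main() {
            int a[] = {1, 2, 3, 4};
            std::cout << "result = " << sum(a, 4)
                      << ", annotated cost = " << cost << "\n";  // 3 + 4*5 + 2 = 25
        }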

    The Verified CakeML Compiler Backend

    The CakeML compiler is, to the best of our knowledge, the most realistic verified compiler for a functional programming language to date. The architecture of the compiler, a sequence of intermediate languages through which high-level features are compiled away incrementally, enables verification of each compilation pass at an appropriate level of semantic detail. Parts of the compiler's implementation resemble mainstream (unverified) compilers for strict functional languages, and it supports several important features and optimisations. These include efficient curried multi-argument functions, configurable data representations, efficient exceptions, register allocation, and more. The compiler produces machine code for five architectures: x86-64, ARMv6, ARMv8, MIPS-64, and RISC-V. The generated machine code contains the verified runtime system, which includes a verified generational copying garbage collector and a verified arbitrary-precision arithmetic (bignum) library. In this paper we present the overall design of the compiler backend, including its 12 intermediate languages. We explain how the semantics and proofs fit together, and provide detail on how the compiler has been bootstrapped inside the logic of a theorem prover. The entire development has been carried out within the HOL4 theorem prover.