Search CORE

90 research outputs found

The problem of programming language concurrency semantics

Author: G. Boudol
J. Ševčík
K. Gharachorloo
L. Lamport
P. Cenciarelli
R. Jagadeesan
Publication venue: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Publication date: 01/01/2015
Field of study

Despite decades of research, we do not have a satisfactory concurrency semantics for any general-purpose programming language that aims to support concurrent systems code. The Java Memory Model has been shown to be unsound with respect to standard compiler optimisations, while the C/C++11 model is too weak, admitting undesirable thin-air executions. Our goal in this paper is to articulate this major open problem as clearly as is currently possible, showing how it arises from the combination of multiprocessor relaxed-memory behaviour and the desire to accommodate current compiler optimisations. We make several novel contributions that each shed some light on the problem, constraining the possible solutions and identifying new difficulties. First we give a positive result, proving in HOL4 that the existing axiomatic model for C/C++11 guarantees sequentially consistent semantics for simple race-free programs that do not use low-level atomics (DRF-SC, one of the core design goals). We then describe the thin-air problem and show that it cannot be solved, without restricting current compiler optimisations, using any per-candidate-execution condition in the style of the C/C++11 model. Thin-air executions were thought to be confined to programs using relaxed atomics, but we further show that they recur when one attempts to integrate the concurrency model with more of C, mixing atomic and nonatomic accesses, and that also breaks the DRF-SC result. We then describe a semantics based on an explicit operational construction of out-of-order execution, giving the desired behaviour for thin-air examples but exposing further difficulties with accommodating existing compiler optimisations. Finally, we show that there are major difficulties integrating concurrency semantics with the C/C++ notion of undefined behaviour. We hope thereby to stimulate and enable research on this key issue

Crossref

Kent Academic Repository

Apollo (Cambridge)

A concurrency semantics for relaxed atomics that permits optimisation and avoids thin-air executions

Author: Pichon-Pharabod J
Sewell P
Publication venue: ACM SIGPLAN Notices
Publication date: 08/04/2016
Field of study

Copyright is held by the owner/author(s). Despite much research on concurrent programming languages, especially for Java and C/C++, we still do not have a satisfactory definition of their semantics, one that admits all common optimisations without also admitting undesired behaviour. Especially problematic are the "thin-Air" examples involving high-performance concurrent accesses, such as C/C++11 relaxed atomics. The C/C++11 model is in a per-candidate-execution style, and previous work has identified a tension between that and the fact that compiler optimisations do not operate over single candidate executions in isolation; rather, they operate over syntactic representations that represent all executions. In this paper we propose a novel approach that circumvents this difficulty. We define a concurrency semantics for a core calculus, including relaxed-Atomic and non-Atomic accesses, and locks, that admits a wide range of optimisation while still forbidding the classic thin-Air examples. It also addresses other problems relating to undefined behaviour. The basic idea is to use an event-structure representation of the current state of each thread, capturing all of its potential executions, and to permit interleaving of execution and transformation steps over that to reflect optimisation (possibly dynamic) of the code. These are combined with a non-multi-copy-Atomic storage subsystem, to reflect common hardware behaviour. The semantics is defined in a mechanised and executable form, and designed to be implementable above current relaxed hardware and strong enough to support the programming idioms that C/C++11 does for this fragment. It offers a potential way forward for concurrent programming language semantics, beyond the current C/C++11 and Java models.This work was partly funded by the EPSRC Programme Grant REMS: Rigorous Engineering for Mainstream Systems, EP/K008528/

Crossref

Apollo (Cambridge)

TriCheck: Memory Model Verification at the Trisection of Software, Hardware, and ISA

Author: Collier William W.
Gharachorloo Kourosh
International SPARC
J.
Jackson Daniel
Petri Gustavo
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2017
Field of study

Memory consistency models (MCMs) which govern inter-module interactions in a shared memory system, are a significant, yet often under-appreciated, aspect of system design. MCMs are defined at the various layers of the hardware-software stack, requiring thoroughly verified specifications, compilers, and implementations at the interfaces between layers. Current verification techniques evaluate segments of the system stack in isolation, such as proving compiler mappings from a high-level language (HLL) to an ISA or proving validity of a microarchitectural implementation of an ISA. This paper makes a case for full-stack MCM verification and provides a toolflow, TriCheck, capable of verifying that the HLL, compiler, ISA, and implementation collectively uphold MCM requirements. The work showcases TriCheck's ability to evaluate a proposed ISA MCM in order to ensure that each layer and each mapping is correct and complete. Specifically, we apply TriCheck to the open source RISC-V ISA, seeking to verify accurate, efficient, and legal compilations from C11. We uncover under-specifications and potential inefficiencies in the current RISC-V ISA documentation and identify possible solutions for each. As an example, we find that a RISC-V-compliant microarchitecture allows 144 outcomes forbidden by C11 to be observed out of 1,701 litmus tests examined. Overall, this paper demonstrates the necessity of full-stack verification for detecting MCM-related bugs in the hardware-software stack.Comment: Proceedings of the Twenty-Second International Conference on Architectural Support for Programming Languages and Operating System

arXiv.org e-Print Archive

Princeton University Open Access Repository

Crossref

Overhauling SC atomics in C11 and OpenCL

Author: Alastair F. Donaldson
IEC.
IEC.
IEC.
John Wickerson
Mark Batty
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 05/10/2015
Field of study

Despite the conceptual simplicity of sequential consistency (SC), the semantics of SC atomic operations and fences in the C11 and OpenCL memory models is subtle, leading to convoluted prose descriptions that translate to complex axiomatic formalisations. We conduct an overhaul of SC atomics in C11, reducing the associated axioms in both number and complexity. A consequence of our simplification is that the SC operations in an execution no longer need to be totally ordered. This relaxation enables, for the first time, efficient and exhaustive simulation of litmus tests that use SC atomics. We extend our improved C11 model to obtain the first rigorous memory model formalisation for OpenCL (which extends C11 with support for heterogeneous many-core programming). In the OpenCL setting, we refine the SC axioms still further to give a sensible semantics to SC operations that employ a ‘memory scope’ to restrict their visibility to specific threads. Our overhaul requires slight strengthenings of both the C11 and the OpenCL memory models, causing some behaviours to become disallowed. We argue that these strengthenings are natural, and that all of the formalised C11 and OpenCL compilation schemes of which we are aware (Power and x86 CPUs for C11, AMD GPUs for OpenCL) remain valid in our revised models. Using the HERD memory model simulator, we show that our overhaul leads to an exponential improvement in simulation time for C11 litmus tests compared with the original model, making exhaustive simulation competitive, time-wise, with the non-exhaustive CDSChecker tool

arXiv.org e-Print Archive

Crossref

Kent Academic Repository

Spiral - Imperial College Digital Repository

Compositional Verification of Compiler Optimisations on Relaxed Memory

Author: D Distefano
D Jackson
Daniel Poetzl
J Alglave
J Ševčík
J Ševčík
M Batty
MP Herlihy
S Brookes
Viktor Vafeiadis
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 16/02/2018
Field of study

This paper is about verifying program transformations on an axiomatic relaxed memory model of the kind used in C/C++ and Java. Relaxed models present particular challenges for verifying program transformations, because they generate many additional modes of interaction between code and context. For a block of code being transformed, we define a denotation from its behaviour in a set of representative contexts. Our denotation summarises interactions of the code block with the rest of the program both through local and global variables, and through subtle synchronisation effects due to relaxed memory. We can then prove that a transformation does not introduce new program behaviours by comparing the denotations of the code block before and after. Our approach is compositional: by examining only representative contexts, transformations are verified for any context. It is also fully abstract, meaning any valid transformation can be verified. We cover several tricky aspects of C/C++-style memory models, including release-acquire operations, sequentially consistent fences, and non-atomics. We also define a variant of our denotation that is finite at the cost of losing full abstraction. Based on this variant, we have implemented a prototype verification tool and ap

arXiv.org e-Print Archive

Crossref

Kent Academic Repository

Automatically Comparing Memory Consistency Models

Author: Abrial Jean-Raymond
Adve Sarita V.
Aho Alfred V.
Alglave Jade
Audemard Gilles
Biere Armin
Bornholt James
Collier William W.
Eén Niklas
Gharachorloo Kourosh
Higham Lisa
IEC.
Jackson Daniel
John
Lahav Ori
Möhring Rolf H.
Sezgin Ali
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 03/10/2016
Field of study

A memory consistency model (MCM) is the part of a programming language or computer architecture specification that defines which values can legally be read from shared memory locations. Because MCMs take into account various optimisations employed by archi- tectures and compilers, they are often complex and counterintu- itive, which makes them challenging to design and to understand. We identify four tasks involved in designing and understanding MCMs: generating conformance tests, distinguishing two MCMs, checking compiler optimisations, and checking compiler mappings. We show that all four tasks are instances of a general constraint-satisfaction problem to which the solution is either a program or a pair of programs. Although this problem is intractable for automatic solvers when phrased over programs directly, we show how to solve analogous constraints over program executions, and then construct programs that satisfy the original constraints. Our technique, which is implemented in the Alloy modelling framework, is illustrated on several software- and architecture-level MCMs, both axiomatically and operationally defined. We automatically recreate several known results, often in a simpler form, including: distinctions between variants of the C11 MCM; a failure of the ‘SC-DRF guarantee’ in an early C11 draft; that x86 is ‘multi-copy atomic’ and Power is not; bugs in common C11 compiler optimisations; and bugs in a compiler mapping from OpenCL to AMD-style GPUs. We also use our technique to develop and validate a new MCM for NVIDIA GPUs that supports a natural mapping from OpenCL

Crossref

Kent Academic Repository

Spiral - Imperial College Digital Repository

Recommended from our members

A no-thin-air memory model for programming languages

Author: Pichon-Pharabod Jean Yves Alexis
Publication venue: University of Cambridge
Publication date: 28/03/2018
Field of study

Many hardware and compiler optimisations introduced to speed up single-threaded programs also introduce additional, sometimes surprising, behaviours for concurrent programs with shared mutable state. How many of these extra behaviours occur in practice depends on the combination of the hardware, compiler, runtime, etc. that make up the platform. A memory model, which prescribes what values each read of a concurrent program can read, allows programmers to determine whether a program behaves as expected without having to worry about the details of the platform. However, capturing these behaviours in a memory model without also including undesirable "out-of-thin-air" behaviours that do not occur in practice has proved elusive. The memory model of C and C++ allows out-of-thin-air behaviour, while the Java memory model fails to capture some behaviours that are introduced in practice by compiler optimisations. In this thesis, we propose a memory model that forbids out-of-thin-air behaviour, yet allows the behaviours that do occur. Our memory model follows operational intuitions of how the hardware and compilers operate. We illustrate that it behaves as desired on a series of litmus tests. We show that it captures at least some of the expected behaviours, that it forms an envelope around some common compiler optimisations, and that it is implementable on common hardware using the expected compilation schemes. We also show that it supports some established programming idioms.EPSRC Programme Grant REMS: Rigorous Engineering for Mainstream Systems, EP/K008628/1 Computer Laboratory Premium Research Studentshi

Apollo (Cambridge)

Dynamic race detection for C++11

Author: Donaldson AF
Lidbury C
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 03/10/2016
Field of study

The intricate rules for memory ordering and synchronisation associated with the C/C++11 memory model mean that data races can be difficult to eliminate from concurrent programs. Dynamic data race analysis can pinpoint races in large and complex applications, but the state-of-the-art ThreadSanitizer (tsan) tool for C/C++ considers only sequentially consistent program executions, and does not correctly model synchronisation between C/C++11 atomic operations. We present a scalable dynamic data race analysis for C/C++11 that correctly captures C/C++11 synchronisation, and uses instrumentation to support exploration of a class of non sequentially consistent executions. We concisely define the memory model fragment captured by our instrumentation via a restricted axiomatic semantics, and show that the axiomatic semantics permits exactly those executions explored by our instrumentation. We have implemented our analysis in tsan, and evaluate its effectiveness on benchmark programs, enabling a comparison with the CDSChecker tool, and on two large and highly concurrent applications: the Firefox and Chromium web browsers. Our results show that our method can detect races that are beyond the scope of the original tsan tool, and that the overhead associated with applying our enhanced instrumentation to large applications is tolerable

Spiral - Imperial College Digital Repository

Into the depths of C: Elaborating the de facto standards

Author: Chisnall D
Lingard J
Matthiesen J
Memarian K
Nienhuis K
Sewell P
Watson RNM
Publication venue: ACM SIGPLAN Notices
Publication date: 01/01/2016
Field of study

C remains central to our computing infrastructure. It is notionally defined by ISO standards, but in reality the properties of C assumed by systems code and those implemented by compilers have diverged, both from the ISO standards and from each other, and none of these are clearly understood. We make two contributions to help improve this error-prone situation. First, we describe an in-depth analysis of the design space for the semantics of pointers and memory in C as it is used in practice. We articulate many specific questions, build a suite of semantic test cases, gather experimental data from multiple implementations, and survey what C experts believe about the de facto standards. We identify questions where there is a consensus (either following ISO or differing) and where there are conflicts. We apply all this to an experimental C implemented above capability hardware. Second, we describe a formal model, Cerberus, for large parts of C. Cerberus is parameterised on its memory model; it is linkable either with a candidate de facto memory object model, under construction, or with an operational C11 concurrency model; it is defined by elaboration to a much simpler Core language for accessibility, and it is executable as a test oracle on small examples. This should provide a solid basis for discussion of what mainstream C is now: what programmers and analysis tools can assume and what compilers aim to implement. Ultimately we hope it will be a step towards clear, consistent, and accepted semantics for the various use-cases of C.We acknowledge funding from EPSRC grants EP/H005633 (Leadership Fellowship, Sewell) and EP/K008528 (REMS Programme Grant), and a Gates Cambridge Scholarship (Nienhuis). This work is also part of the CTSRD projects sponsored by the Defense Advanced Research Projects Agency (DARPA) and the Air Force Research Laboratory (AFRL), under contract FA8750-10-C-0237.This is the author accepted manuscript. The final version is available from the Association for Computing Machinery via http://dx.doi.org/10.1145/2908080.290808

Crossref

Apollo (Cambridge)

Recommended from our members

Weakening WebAssembly

Author: Pichon-Pharabod Jean
Rossberg Andreas
Watt Conrad
Publication venue: Proceedings of the ACM on Programming Languages
Publication date: 10/10/2019
Field of study

WebAssembly (Wasm) is a safe, portable virtual instruction set that can be hosted in a wide range of environments, such as a Web browser. It is a low-level language whose instructions are intended to compile directly to bare hardware. While the initial version of Wasm focussed on single-threaded computation, a recent proposal extends it with low-level support for multiple threads and atomic instructions for synchronised access to shared memory. To support the correct compilation of concurrent programs, it is necessary to give a suitable specification of its memory model. Wasm’s language definition is based on a fully formalised specification that carefully avoids undefined behaviour. We present a substantial extension to this semantics, incorporating a relaxed memory model, along with a few proposed operational extensions. Wasm’s memory model is unique in that its linear address space can be dynamically grown during execution, while all accesses are bounds-checked. This leads to the novel problem of specifying how observations about the size of the memory can propagate between threads. We argue that, considering desirable compilation schemes, we cannot give a sequentially consistent semantics to memory growth. We show that our model guarantees Sequential Consistency of Data-Race-Free programs (SC-DRF). However, because Wasm is to run on the Web, we must also consider interoperability of its model with that of JavaScript. We show, by counter-example, that JavaScript’s memory model is not SC-DRF, in contrast to what is claimed in its specification. We propose two axiomatic conditions that should be added to the JavaScript model to correct this difference. We also describe a prototype SMT-based litmus tool which acts as an oracle for our axiomatic model, visualising its behaviours, including memory resizing

Apollo (Cambridge)