120 research outputs found

    Mathematizing C++ concurrency

    Get PDF
    Shared-memory concurrency in C and C++ is pervasive in systems programming, but has long been poorly defined. This motivated an ongoing shared effort by the standards committees to specify concurrent behaviour in the next versions of both languages. They aim to provide strong guarantees for race-free programs, together with new (but subtle) relaxed-memory atomic primitives for high-performance concurrent code. However, the current draft standards, while the result of careful deliberation, are not yet clear and rigorous definitions, and harbour substantial problems in their details. In this paper we establish a mathematical (yet readable) semantics for C++ concurrency. We aim to capture the intent of the current (`Final Committee') Draft as closely as possible, but discuss changes that fix many of its problems. We prove that a proposed x86 implementation of the concurrency primitives is correct with respect to the x86-TSO model, and describe our Cppmem tool for exploring the semantics of examples, using code generated from our Isabelle/HOL definitions. Having already motivated changes to the draft standard, this work will aid discussion of any further changes, provide a correctness condition for compilers, and give a much-needed basis for analysis and verification of concurrent C and C++ programs

    Robustness against Power is PSPACE-complete

    Full text link
    Power is a RISC architecture developed by IBM, Freescale, and several other companies and implemented in a series of POWER processors. The architecture features a relaxed memory model providing very weak guarantees with respect to the ordering and atomicity of memory accesses. Due to these weaknesses, some programs that are correct under sequential consistency (SC) show undesirable effects when run under Power. We call these programs not robust against the Power memory model. Formally, a program is robust if every computation under Power has the same data and control dependencies as some SC computation. Our contribution is a decision procedure for robustness of concurrent programs against the Power memory model. It is based on three ideas. First, we reformulate robustness in terms of the acyclicity of a happens-before relation. Second, we prove that among the computations with cyclic happens-before relation there is one in a certain normal form. Finally, we reduce the existence of such a normal-form computation to a language emptiness problem. Altogether, this yields a PSPACE algorithm for checking robustness against Power. We complement it by a matching lower bound to show PSPACE-completeness

    The Benefits of Duality in Verifying Concurrent Programs under TSO

    Get PDF
    We address the problem of verifying safety properties of concurrent programs running over the TSO memory model. Known decision procedures for this model are based on complex encodings of store buffers as lossy channels. These procedures assume that the number of processes is fixed. However, it is important in general to prove correctness of a system/algorithm in a parametric way with an arbitrarily large number of processes. In this paper, we introduce an alternative (yet equivalent) semantics to the classical one for the TSO model that is more amenable for efficient algorithmic verification and for extension to parametric verification. For that, we adopt a dual view where load buffers are used instead of store buffers. The flow of information is now from the memory to load buffers. We show that this new semantics allows (1) to simplify drastically the safety analysis under TSO, (2) to obtain a spectacular gain in efficiency and scalability compared to existing procedures, and (3) to extend easily the decision procedure to the parametric case, which allows to obtain a new decidability result, and more importantly, a verification algorithm that is more general and more efficient in practice than the one for bounded instances

    Bridging the Gap between Programming Languages and Hardware Weak Memory Models

    Full text link
    We develop a new intermediate weak memory model, IMM, as a way of modularizing the proofs of correctness of compilation from concurrent programming languages with weak memory consistency semantics to mainstream multi-core architectures, such as POWER and ARM. We use IMM to prove the correctness of compilation from the promising semantics of Kang et al. to POWER (thereby correcting and improving their result) and ARMv7, as well as to the recently revised ARMv8 model. Our results are mechanized in Coq, and to the best of our knowledge, these are the first machine-verified compilation correctness results for models that are weaker than x86-TSO

    Remote-scope Promotion: Clarified, Rectified, and Verified

    Get PDF
    Modern accelerator programming frameworks, such as OpenCL, organise threads into work-groups. Remote-scope promotion (RSP) is a language extension recently proposed by AMD researchers that is designed to enable applications, for the first time, both to optimise for the common case of intra-work-group communication (using memory scopes to provide consistency only within a work-group) and to allow occasional inter-work-group communication (as required, for instance, to support the popular load-balancing idiom of work stealing). We present the first formal, axiomatic memory model of OpenCL extended with RSP. We have extended the Herd memory model simulator with support for OpenCL kernels that exploit RSP, and used it to discover bugs in several litmus tests and a work-stealing queue, that have been used previously in the study of RSP. We have also formalised the proposed GPU implementation of RSP. The formalisation process allowed us to identify bugs in the description of RSP that could result in well-synchronised programs experiencing memory inconsistencies. We present and prove sound a new implementation of RSP that incorporates bug fixes and requires less non-standard hardware than the original implementation. This work, a collaboration between academia and industry, clearly demonstrates how, when designing hardware support for a new concurrent language feature, the early application of formal tools and techniques can help to prevent errors, such as those we have found, from making it into silicon

    Concurrent Library Correctness on the TSO Memory Model

    Full text link
    Abstract. Linearizability is a commonly accepted notion of correctness for li-braries of concurrent algorithms. Unfortunately, it is only appropriate for se-quentially consistent memory models, while the hardware and software platforms that algorithms run on provide weaker consistency guarantees. In this paper, we present the first definition of linearizability on a weak memory model, Total Store Order (TSO), implemented by x86 processors. We establish that our definition is a correct one in the following sense: while proving a property of a client of a con-current library, we can soundly replace the library by its abstract implementation related to the original one by our generalisation of linearizability. This allows ab-stracting from the details of the library implementation while reasoning about the client. We have developed a tool for systematically testing concurrent libraries against our definition and applied it to several challenging algorithms.

    Verification of a lazy cache coherence protocol against a weak memory model

    Get PDF
    In this paper we verify a modern lazy cache coherence protocol, TSO-CC, against the memory consistency model it was designed for, TSO. We achieve this by first showing a weak simulation relation between TSO-CC (with a fixed number of processors) and a novel finite-state operational model which exhibits the laziness of TSO-CC and satisfies TSO. We then extend this by an existing parameterisation technique, allowing verification for an unlimited number of processors. The approach is executed entirely within a model checker, no external tool is required and very little in-depth knowledge of formal verification methods is required of the verifier.Comment: 10 page
    • …
    corecore