    Overhauling SC atomics in C11 and OpenCL

    Despite the conceptual simplicity of sequential consistency (SC), the semantics of SC atomic operations and fences in the C11 and OpenCL memory models is subtle, leading to convoluted prose descriptions that translate to complex axiomatic formalisations. We conduct an overhaul of SC atomics in C11, reducing the associated axioms in both number and complexity. A consequence of our simplification is that the SC operations in an execution no longer need to be totally ordered. This relaxation enables, for the first time, efficient and exhaustive simulation of litmus tests that use SC atomics. We extend our improved C11 model to obtain the first rigorous memory model formalisation for OpenCL (which extends C11 with support for heterogeneous many-core programming). In the OpenCL setting, we refine the SC axioms still further to give a sensible semantics to SC operations that employ a ‘memory scope’ to restrict their visibility to specific threads. Our overhaul requires slight strengthenings of both the C11 and the OpenCL memory models, causing some behaviours to become disallowed. We argue that these strengthenings are natural, and that all of the formalised C11 and OpenCL compilation schemes of which we are aware (Power and x86 CPUs for C11, AMD GPUs for OpenCL) remain valid in our revised models. Using the HERD memory model simulator, we show that our overhaul leads to an exponential improvement in simulation time for C11 litmus tests compared with the original model, making exhaustive simulation competitive, time-wise, with the non-exhaustive CDSChecker tool.
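
    To make concrete what a litmus test over SC atomics looks like, here is a minimal sketch of the classic store-buffering test written with C11 SC atomics; the variable names and the pthreads scaffolding are our own, not the paper's. Under the C11 model, the seq_cst accesses forbid the outcome r0 == 0 && r1 == 0, which relaxed atomics would permit.

        /* Store-buffering (SB) litmus test with C11 SC atomics. */
        #include <stdatomic.h>
        #include <stdio.h>
        #include <pthread.h>

        atomic_int x = 0, y = 0;
        int r0, r1;

        void *thread0(void *arg) {
            (void)arg;
            atomic_store_explicit(&x, 1, memory_order_seq_cst);
            r0 = atomic_load_explicit(&y, memory_order_seq_cst);
            return NULL;
        }

        void *thread1(void *arg) {
            (void)arg;
            atomic_store_explicit(&y, 1, memory_order_seq_cst);
            r1 = atomic_load_explicit(&x, memory_order_seq_cst);
            return NULL;
        }

        int main(void) {
            pthread_t t0, t1;
            pthread_create(&t0, NULL, thread0, NULL);
            pthread_create(&t1, NULL, thread1, NULL);
            pthread_join(t0, NULL);
            pthread_join(t1, NULL);
            /* Forbidden under SC semantics: r0 == 0 && r1 == 0.
               With memory_order_relaxed everywhere it would be allowed. */
            printf("r0=%d r1=%d\n", r0, r1);
            return 0;
        }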

    Automatically Optimizing the Latency, Area, and Accuracy of C Programs for High-Level Synthesis

    Loops are pervasive in numerical programs, so high-level synthesis (HLS) tools use state-of-the-art scheduling techniques to pipeline them efficiently. Still, the run-time performance of the resultant FPGA implementation is limited by data dependences between loop iterations. Some of these dependence constraints can be alleviated by rewriting the program according to arithmetic identities (e.g. associativity and distributivity), memory access reductions, and control-flow optimizations (e.g. partial loop unrolling). HLS tools cannot safely enable such rewrites by default because they may impact the accuracy of floating-point computations and increase area usage. In this paper, we introduce the first open-source program optimizer for automatically rewriting a given program to optimize latency while controlling for accuracy and area. Our tool, SOAP3, reports a multi-dimensional Pareto frontier that the programmer can use to resolve the trade-off according to their needs. When applied to a suite of PolyBench and Livermore Loops benchmarks, our tool has generated programs that enjoy up to a 12× speedup, with a simultaneous 7× increase in accuracy, at a cost of up to 4× more LUTs.
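
    As a hand-written illustration (not actual SOAP3 output) of the kind of rewrite the tool automates, consider a floating-point accumulation. The loop-carried dependence on the accumulator serialises the additions; applying associativity to split the sum into two independent partial sums halves the recurrence depth, letting a pipelined schedule initiate iterations more often, at the cost of possibly changing the rounding.

        #include <stdio.h>

        /* The loop-carried dependence on `sum` serialises the adds:
           each iteration must wait for the previous add to complete. */
        float sum_naive(const float *a, int n) {
            float sum = 0.0f;
            for (int i = 0; i < n; i++)
                sum += a[i];
            return sum;
        }

        /* Associativity rewrite: two independent partial sums halve the
           recurrence depth. Floating-point addition is not associative,
           so the result may round differently -- exactly the
           accuracy/latency/area trade-off the paper explores. */
        float sum_unrolled(const float *a, int n) {
            float s0 = 0.0f, s1 = 0.0f;
            int i;
            for (i = 0; i + 1 < n; i += 2) {
                s0 += a[i];
                s1 += a[i + 1];
            }
            if (i < n)
                s0 += a[i];   /* leftover element when n is odd */
            return s0 + s1;
        }

        int main(void) {
            const float a[5] = { 1e8f, 1.0f, -1e8f, 1.0f, 1.0f };
            /* The two versions can disagree in the final bits. */
            printf("naive=%g unrolled=%g\n",
                   sum_naive(a, 5), sum_unrolled(a, 5));
            return 0;
        }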

    A Study of the Learnability of Relational Properties: Model Counting Meets Machine Learning (MCML)

    This paper introduces the MCML approach for empirically studying the learnability of relational properties that can be expressed in the well-known software design language Alloy. A key novelty of MCML is that it quantifies the performance of, and the semantic differences among, trained machine learning (ML) models, specifically decision trees, with respect to entire (bounded) input spaces, and not just for given training and test datasets (as is the common practice). MCML reduces these quantification problems to the classic complexity-theory problem of model counting, and employs state-of-the-art model counters. The results show that relatively simple ML models can achieve surprisingly high performance (accuracy and F1-score) when evaluated in the common setting of using training and test datasets, even when the training dataset is much smaller than the test dataset, indicating the seeming simplicity of learning relational properties. However, MCML metrics based on model counting show that the performance can degrade substantially when tested against the entire (bounded) input space, indicating the high complexity of precisely learning these properties, and the usefulness of model counting in quantifying the true performance.
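
    The core measurement is easy to sketch. In the toy C program below, naive enumeration stands in for a model counter, the bounded input space is all binary relations over a 3-element domain (2^9 = 512 relations), the relational property is transitivity, and `learned` is a deliberately crude stand-in for a trained decision tree; all of these choices are our own illustrative assumptions, and real MCML works symbolically rather than by enumeration.

        #include <stdio.h>
        #include <stdbool.h>

        #define N 3

        /* A relation over {0,1,2} is encoded as a 9-bit vector. */
        static bool edge(unsigned r, int i, int j) {
            return (r >> (i * N + j)) & 1u;
        }

        /* Ground truth: is the relation transitive? */
        static bool transitive(unsigned r) {
            for (int i = 0; i < N; i++)
                for (int j = 0; j < N; j++)
                    for (int k = 0; k < N; k++)
                        if (edge(r, i, j) && edge(r, j, k) && !edge(r, i, k))
                            return false;
            return true;
        }

        static int popcount(unsigned v) {
            int c = 0;
            for (; v; v >>= 1)
                c += (int)(v & 1u);
            return c;
        }

        /* Hypothetical "model": guesses transitive iff the relation is small. */
        static bool learned(unsigned r) {
            return popcount(r) <= 3;
        }

        int main(void) {
            int agree = 0, total = 1 << (N * N);
            /* Evaluate over the ENTIRE bounded input space, not a sample. */
            for (unsigned r = 0; r < (unsigned)total; r++)
                if (learned(r) == transitive(r))
                    agree++;
            printf("agreement on full space: %d / %d\n", agree, total);
            return 0;
        }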

    Concurrent verification for sequential programs

    This dissertation makes two contributions to the field of software verification. The first explains how verification techniques originally developed for concurrency can be usefully applied to sequential programs. The second describes how sequential programs can be verified using diagrams that have a parallel nature.

    The first contribution involves a new treatment of stability in verification methods based on rely-guarantee. When an assertion made in one thread of a concurrent system cannot be invalidated by the actions of other threads, that assertion is said to be 'stable'. Stability is normally enforced through side-conditions on rely-guarantee proof rules. This dissertation proposes instead to encode stability information into the syntactic form of the assertion. This approach, which we call explicit stabilisation, brings several benefits. First, we empower rely-guarantee with the ability to reason about library code for the first time. Second, when the rely-guarantee method is redeployed in a sequential setting, explicit stabilisation allows more details of a module's implementation to be hidden when verifying clients. Third, explicit stabilisation brings a more nuanced understanding of the important issue of stability in concurrent and sequential verification; such an understanding grows ever more important as verification techniques grow ever more complex.

    The second contribution is a new method of presenting program proofs conducted in separation logic. Building on work by Jules Bean, the ribbon proof is a diagrammatic alternative to the standard 'proof outline'. By emphasising the structure of a proof, ribbon proofs are intelligible and hence useful pedagogically. Because they contain less redundancy than proof outlines, and allow each proof step to be checked locally, they are highly scalable; this we illustrate with a ribbon proof of the Version 7 Unix memory manager. Where proof outlines are cumbersome to modify, ribbon proofs can be visually manoeuvred to yield proofs of variant programs. We describe the ribbon proof system, prove its soundness and completeness, and outline a prototype tool for mechanically checking the diagrams it produces.
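
    Stability is easy to see in code. The fragment below is our own illustrative sketch, not from the thesis; it assumes a rely condition under which the environment may only increment the shared counter x.

        /* Illustrative sketch of assertion stability.
           Assumed rely condition: other threads may only INCREMENT x. */
        #include <stdatomic.h>
        #include <assert.h>

        atomic_int x;

        void observer(void) {
            int seen = atomic_load(&x);
            /* Stable under the rely: "x >= seen" cannot be invalidated
               by an interfering increment, so it survives to any later
               program point. */
            assert(atomic_load(&x) >= seen);
            /* Unstable: "x == seen" could be falsified by an increment
               between the two loads, so a rely-guarantee proof must not
               carry it across an interference point. Explicit
               stabilisation records, in the assertion itself, the rely
               under which the assertion remains valid. */
        }

        int main(void) {
            observer();
            return 0;
        }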

    Unifying models of data flow

    We propose a model of computation, based on data flow, that unifies several disparate programming phenomena, including local and shared variables, synchronised and buffered communication, reliable and unreliable channels, dynamic and static allocation, explicit and garbage-collected disposal, fine-grained and coarse-grained concurrency, and weakly and strongly consistent memory.