10,154 research outputs found

    Privatization-Safe Transactional Memories

    Get PDF
    Transactional memory (TM) facilitates the development of concurrent applications by letting the programmer designate certain code blocks as atomic. Programmers using a TM often would like to access the same data both inside and outside transactions, and would prefer their programs to have a strongly atomic semantics, which allows transactions to be viewed as executing atomically with respect to non-transactional accesses. Since guaranteeing such semantics for arbitrary programs is prohibitively expensive, researchers have suggested guaranteeing it only for certain data-race free (DRF) programs, particularly those that follow the privatization idiom: from some point on, threads agree that a given object can be accessed non-transactionally. In this paper we show that a variant of Transactional DRF (TDRF) by Dalessandro et al. is appropriate for a class of privatization-safe TMs, which allow using privatization idioms. We prove that, if such a TM satisfies a condition we call privatization-safe opacity and a program using the TM is TDRF under strongly atomic semantics, then the program indeed has such semantics. We also present a method for proving privatization-safe opacity that reduces proving this generalization to proving the usual opacity, and apply the method to a TM based on two-phase locking and a privatization-safe version of TL2. Finally, we establish the inherent cost of privatization-safety: we prove that a TM cannot be progressive and have invisible reads if it guarantees strongly atomic semantics for TDRF programs

    Reasoning algebraically about refinement on TSO architectures

    Get PDF
    The Total Store Order memory model is widely implemented by modern multicore architectures such as x86, where local buffers are used for optimisation, allowing limited forms of instruction reordering. The presence of buffers and hardware-controlled buffer flushes increases the level of non-determinism from the level specified by a program, complicating the already difficult task of concurrent programming. This paper presents a new notion of refinement for weak memory models, based on the observation that pending writes to a process' local variables may be treated as if the effect of the update has already occurred in shared memory. We develop an interval-based model with algebraic rules for various programming constructs. In this framework, several decomposition rules for our new notion of refinement are developed. We apply our approach to verify the spinlock algorithm from the literature

    How functional programming mattered

    Get PDF
    In 1989 when functional programming was still considered a niche topic, Hughes wrote a visionary paper arguing convincingly ‘why functional programming matters’. More than two decades have passed. Has functional programming really mattered? Our answer is a resounding ‘Yes!’. Functional programming is now at the forefront of a new generation of programming technologies, and enjoying increasing popularity and influence. In this paper, we review the impact of functional programming, focusing on how it has changed the way we may construct programs, the way we may verify programs, and fundamentally the way we may think about programs

    The Parallel Persistent Memory Model

    Full text link
    We consider a parallel computational model that consists of PP processors, each with a fast local ephemeral memory of limited size, and sharing a large persistent memory. The model allows for each processor to fault with bounded probability, and possibly restart. On faulting all processor state and local ephemeral memory are lost, but the persistent memory remains. This model is motivated by upcoming non-volatile memories that are as fast as existing random access memory, are accessible at the granularity of cache lines, and have the capability of surviving power outages. It is further motivated by the observation that in large parallel systems, failure of processors and their caches is not unusual. Within the model we develop a framework for developing locality efficient parallel algorithms that are resilient to failures. There are several challenges, including the need to recover from failures, the desire to do this in an asynchronous setting (i.e., not blocking other processors when one fails), and the need for synchronization primitives that are robust to failures. We describe approaches to solve these challenges based on breaking computations into what we call capsules, which have certain properties, and developing a work-stealing scheduler that functions properly within the context of failures. The scheduler guarantees a time bound of O(W/PA+D(P/PA)log1/fW)O(W/P_A + D(P/P_A) \lceil\log_{1/f} W\rceil) in expectation, where WW and DD are the work and depth of the computation (in the absence of failures), PAP_A is the average number of processors available during the computation, and f1/2f \le 1/2 is the probability that a capsule fails. Within the model and using the proposed methods, we develop efficient algorithms for parallel sorting and other primitives.Comment: This paper is the full version of a paper at SPAA 2018 with the same nam

    The design and implementation of a verification technique for GPU Kernels

    Get PDF
    We present a technique for the formal verification of GPU kernels, addressing two classes of correctness properties: data races and barrier divergence. Our approach is founded on a novel formal operational semantics for GPU kernels termed synchronous, delayed visibility (SDV) semantics, which captures the execution of a GPU kernel by multiple groups of threads. The SDV semantics provides operational definitions for barrier divergence and for both inter- and intra-group data races. We build on the semantics to develop a method for reducing the task of verifying a massively parallel GPU kernel to that of verifying a sequential program. This completely avoids the need to reason about thread interleavings, and allows existing techniques for sequential program verification to be leveraged. We describe an efficient encoding of data race detection and propose a method for automatically inferring the loop invariants that are required for verification. We have implemented these techniques as a practical verification tool, GPUVerify, that can be applied directly to OpenCL and CUDA source code. We evaluate GPUVerify with respect to a set of 162 kernels drawn from public and commercial sources. Our evaluation demonstrates that GPUVerify is capable of efficient, automatic verification of a large number of real-world kernels

    IST Austria Thesis

    Get PDF
    Designing and verifying concurrent programs is a notoriously challenging, time consuming, and error prone task, even for experts. This is due to the sheer number of possible interleavings of a concurrent program, all of which have to be tracked and accounted for in a formal proof. Inventing an inductive invariant that captures all interleavings of a low-level implementation is theoretically possible, but practically intractable. We develop a refinement-based verification framework that provides mechanisms to simplify proof construction by decomposing the verification task into smaller subtasks. In a first line of work, we present a foundation for refinement reasoning over structured concurrent programs. We introduce layered concurrent programs as a compact notation to represent multi-layer refinement proofs. A layered concurrent program specifies a sequence of connected concurrent programs, from most concrete to most abstract, such that common parts of different programs are written exactly once. Each program in this sequence is expressed as structured concurrent program, i.e., a program over (potentially recursive) procedures, imperative control flow, gated atomic actions, structured parallelism, and asynchronous concurrency. This is in contrast to existing refinement-based verifiers, which represent concurrent systems as flat transition relations. We present a powerful refinement proof rule that decomposes refinement checking over structured programs into modular verification conditions. Refinement checking is supported by a new form of modular, parameterized invariants, called yield invariants, and a linear permission system to enhance local reasoning. In a second line of work, we present two new reduction-based program transformations that target asynchronous programs. These transformations reduce the number of interleavings that need to be considered, thus reducing the complexity of invariants. Synchronization simplifies the verification of asynchronous programs by introducing the fiction, for proof purposes, that asynchronous operations complete synchronously. Synchronization summarizes an asynchronous computation as immediate atomic effect. Inductive sequentialization establishes sequential reductions that captures every behavior of the original program up to reordering of coarse-grained commutative actions. A sequential reduction of a concurrent program is easy to reason about since it corresponds to a simple execution of the program in an idealized synchronous environment, where processes act in a fixed order and at the same speed. Our approach is implemented the CIVL verifier, which has been successfully used for the verification of several complex concurrent programs. In our methodology, the overall correctness of a program is established piecemeal by focusing on the invariant required for each refinement step separately. While the programmer does the creative work of specifying the chain of programs and the inductive invariant justifying each link in the chain, the tool automatically constructs the verification conditions underlying each refinement step

    Invariant Synthesis for Incomplete Verification Engines

    Full text link
    We propose a framework for synthesizing inductive invariants for incomplete verification engines, which soundly reduce logical problems in undecidable theories to decidable theories. Our framework is based on the counter-example guided inductive synthesis principle (CEGIS) and allows verification engines to communicate non-provability information to guide invariant synthesis. We show precisely how the verification engine can compute such non-provability information and how to build effective learning algorithms when invariants are expressed as Boolean combinations of a fixed set of predicates. Moreover, we evaluate our framework in two verification settings, one in which verification engines need to handle quantified formulas and one in which verification engines have to reason about heap properties expressed in an expressive but undecidable separation logic. Our experiments show that our invariant synthesis framework based on non-provability information can both effectively synthesize inductive invariants and adequately strengthen contracts across a large suite of programs
    corecore