19 research outputs found

    Verifying Parallel Loops with Separation Logic

    This paper proposes a technique to specify and verify whether a loop can be parallelised. Our approach can be used as an additional step in a parallelising compiler to verify user annotations about loop dependences. Essentially, our technique requires each loop iteration to be specified with the locations it will read and write. From the loop iteration specifications, the loop (in)dependences can be derived. Moreover, the loop iteration specifications also reveal where synchronisation is needed in the parallelised program. The loop iteration specifications can be verified using permission-based separation logic. (In Proceedings PLACES 2014, arXiv:1406.331)
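
    A minimal CUDA sketch of the idea (the example loop, the kernel name add_kernel, and the specification phrasing in the comments are illustrative assumptions, not the paper's notation): each iteration declares that it writes only c[i] and reads only a[i] and b[i], so the iteration footprints are disjoint, the loop has no dependences, and it can be mapped to one GPU thread per iteration with no synchronisation.

```cuda
// Illustrative sketch: a sequential loop with a per-iteration specification
// (as comments) and the parallelised form derived from it.
#include <cstdio>
#include <cuda_runtime.h>

// Sequential loop. Per-iteration specification (schematic):
//   iteration i writes c[i] and reads a[i], b[i] only.
// Different iterations have disjoint footprints, so the loop has no
// dependences and can be parallelised without synchronisation.
void add_sequential(const float* a, const float* b, float* c, int n) {
    for (int i = 0; i < n; ++i)
        c[i] = a[i] + b[i];
}

// Parallelised form: one thread per former loop iteration.
__global__ void add_kernel(const float* a, const float* b, float* c, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) c[i] = a[i] + b[i];
}

int main() {
    const int n = 1 << 10;
    float *a, *b, *c;
    cudaMallocManaged(&a, n * sizeof(float));
    cudaMallocManaged(&b, n * sizeof(float));
    cudaMallocManaged(&c, n * sizeof(float));
    for (int i = 0; i < n; ++i) { a[i] = 1.0f; b[i] = 2.0f; c[i] = 0.0f; }

    add_kernel<<<(n + 255) / 256, 256>>>(a, b, c, n);
    cudaDeviceSynchronize();

    printf("c[0] = %.1f (expected 3.0)\n", c[0]);
    cudaFree(a); cudaFree(b); cudaFree(c);
    return 0;
}
```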

    Correct and efficient accelerator programming

    This report documents the program and the outcomes of Dagstuhl Seminar 13142 “Correct and Efficient Accelerator Programming”. The aim of this Dagstuhl seminar was to bring together researchers from various sub-disciplines of computer science to brainstorm and discuss the theoretical foundations, design and implementation of techniques and tools for correct and efficient accelerator programming.

    The VerCors tool for verification of concurrent programs

    The VerCors tool implements thread-modular static verification of concurrent programs, annotated with functional properties and heap access permissions. The tool supports both generic multithreaded and vector-based programming models. In particular, it can verify multithreaded programs written in Java, specified with JML extended with separation logic. It can also verify parallelizable programs written in a toy language that supports the characteristic features of OpenCL. The tool verifies programs by first encoding the specified program into a much simpler programming language and then applying the Chalice verifier to the simplified program. In this paper we discuss both the implementation of the tool and the features of its specification language.
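
    The specification style can be illustrated with heap access permissions written as comments over a CUDA kernel (VerCors' actual inputs are Java/JML and its OpenCL-like toy language; the kernel, its name scale_ids, and the Perm(...) syntax below are assumptions for illustration): every thread shares a read permission on factor[0] and exclusively owns out[tid], which is what makes the kernel race free and verifiable one thread at a time.

```cuda
// Illustrative only: permission-style specifications as comments; the Perm(...)
// notation is schematic, not tied to any particular tool syntax.
#include <cstdio>
#include <cuda_runtime.h>

// requires Perm(factor[0], read) ** Perm(out[tid], write);
// ensures  out[tid] == factor[0] * tid;
// Each thread reads the shared cell factor[0] and writes only its own cell
// out[tid], so the per-thread footprints never conflict.
__global__ void scale_ids(const int* factor, int* out, int n) {
    int tid = blockIdx.x * blockDim.x + threadIdx.x;
    if (tid < n) out[tid] = factor[0] * tid;
}

int main() {
    const int n = 256;
    int *factor, *out;
    cudaMallocManaged(&factor, sizeof(int));
    cudaMallocManaged(&out, n * sizeof(int));
    *factor = 3;

    scale_ids<<<1, n>>>(factor, out, n);
    cudaDeviceSynchronize();

    printf("out[5] = %d (expected 15)\n", out[5]);
    cudaFree(factor); cudaFree(out);
    return 0;
}
```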

    Engineering a static verification tool for GPU kernels

    We report on practical experiences over the last 2.5 years related to the engineering of GPUVerify, a static verification tool for OpenCL and CUDA GPU kernels, plotting the progress of GPUVerify from a prototype to a fully functional and relatively efficient analysis tool. Our hope is that this experience report will serve the verification community by helping to inform future tooling efforts.

    Contract-Based General-Purpose GPU Programming

    Using GPUs as general-purpose processors has revolutionized parallel computing by offering, for a large and growing set of algorithms, massive data-parallelization on desktop machines. An obstacle to widespread adoption, however, is the difficulty of programming them and the low-level control of the hardware required to achieve good performance. This paper suggests a programming library, SafeGPU, that aims at striking a balance between programmer productivity and performance, by making GPU data-parallel operations accessible from within a classical object-oriented programming language. The solution is integrated with the design-by-contract approach, which increases confidence in functional program correctness by embedding executable program specifications into the program text. We show that our library leads to modular and maintainable code that is accessible to GPGPU non-experts, while providing performance that is comparable with hand-written CUDA code. Furthermore, runtime contract checking turns out to be feasible, as the contracts can be executed on the GPU.
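
    SafeGPU itself is a library for an object-oriented host language; the CUDA sketch below (the kernel names scale and check_postcondition are hypothetical, not part of the SafeGPU API) only illustrates the underlying idea of executing a contract on the GPU: a second kernel evaluates the postcondition out[i] == 2 * in[i] on the device and raises a flag on violation, so the result never has to be copied back just to validate it.

```cuda
// Illustration of on-device contract checking (not the SafeGPU API).
#include <cstdio>
#include <cuda_runtime.h>

__global__ void scale(const int* in, int* out, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) out[i] = 2 * in[i];
}

// Postcondition "out[i] == 2 * in[i]" checked entirely on the GPU.
__global__ void check_postcondition(const int* in, const int* out, int n,
                                    int* violated) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n && out[i] != 2 * in[i]) atomicExch(violated, 1);
}

int main() {
    const int n = 1 << 12;
    int *in, *out, *violated;
    cudaMallocManaged(&in, n * sizeof(int));
    cudaMallocManaged(&out, n * sizeof(int));
    cudaMallocManaged(&violated, sizeof(int));
    for (int i = 0; i < n; ++i) in[i] = i;
    *violated = 0;

    const int threads = 256, blocks = (n + threads - 1) / threads;
    scale<<<blocks, threads>>>(in, out, n);
    check_postcondition<<<blocks, threads>>>(in, out, n, violated);
    cudaDeviceSynchronize();

    printf("%s\n", *violated ? "postcondition violated" : "postcondition holds");
    cudaFree(in); cudaFree(out); cudaFree(violated);
    return 0;
}
```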

    Specification and verification of atomic operations in GPGPU programs

    We propose a specification and verification technique based on separation logic to reason about data race freedom and functional correctness of GPU kernels that use atomic operations as a synchronisation mechanism. Our approach exploits the notion of resource invariant from Concurrent Separation Logic (CSL) to capture the behaviour of atomic operations. However, because of the different memory levels in the GPU architecture, we adapt this notion of resource invariant to these memory levels, i.e., group resource invariants capture the behaviour of atomic operations that access locations in local memory, while kernel resource invariants capture the behaviour of atomic operations that access locations in global memory. We show soundness of our approach and we provide tool support that enables us to verify kernels from standard benchmark suites.
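
    A small CUDA example of the two memory levels the abstract refers to (the kernel and the invariant phrasing in the comments are illustrative assumptions): atomic updates to a counter in shared (local) memory would fall under a group resource invariant, while the final atomic update to a counter in global memory would fall under a kernel resource invariant.

```cuda
// Illustrative kernel with atomics at both memory levels; the invariant
// descriptions in the comments are schematic, not the paper's notation.
#include <cstdio>
#include <cuda_runtime.h>

__global__ void sum_kernel(const int* data, int n, int* total) {
    // Shared (local-memory) counter: only threads of this group touch it, so a
    // *group* resource invariant would describe it.
    __shared__ int block_sum;
    if (threadIdx.x == 0) block_sum = 0;
    __syncthreads();

    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) atomicAdd(&block_sum, data[i]);   // atomic on local memory
    __syncthreads();

    // Global-memory counter: shared by all groups, so a *kernel* resource
    // invariant would describe it.
    if (threadIdx.x == 0) atomicAdd(total, block_sum);
}

int main() {
    const int n = 1 << 10;
    int *data, *total;
    cudaMallocManaged(&data, n * sizeof(int));
    cudaMallocManaged(&total, sizeof(int));
    for (int i = 0; i < n; ++i) data[i] = 1;
    *total = 0;

    sum_kernel<<<(n + 255) / 256, 256>>>(data, n, total);
    cudaDeviceSynchronize();

    printf("total = %d (expected %d)\n", *total, n);
    cudaFree(data); cudaFree(total);
    return 0;
}
```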

    Verification of loop parallelisations

    Writing correct parallel programs becomes more and more difficult as the complexity and heterogeneity of processors increase. This issue is addressed by parallelising compilers. Various compiler directives can be used to tell these compilers where to parallelise. This paper addresses the correctness of such compiler directives for loop parallelisation. Specifically, we propose a technique based on separation logic to verify whether a loop can be parallelised. Our approach requires each loop iteration to be specified with the locations that are read and written in this iteration. If the specifications are correct, they can be used to draw conclusions about loop (in)dependences. Moreover, they also reveal where synchronisation is needed in the parallelised program. The loop iteration specifications can be verified using permission-based separation logic and seamlessly integrate with functional behaviour specifications. We formally prove the correctness of our approach and we discuss automated tool support for our technique. We also discuss how the loop iteration contracts can be compiled into specifications for the code coming out of the parallelising compiler.
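
    A schematic CUDA example of the synchronisation case (the loop, the kernel name f_par, and the footprint phrasing in the comments are assumptions, not the paper's syntax): iteration i writes a[i] and reads a[i-1], a forward dependence between consecutive iterations, so the parallelised code needs a barrier between producing a[i] and consuming a[i-1].

```cuda
// Schematic example: the iteration specifications expose a forward dependence
// (iteration i reads a[i-1], written by iteration i-1), so the parallelised
// code needs a barrier between the two statements.
#include <cstdio>
#include <cuda_runtime.h>

// Sequential loop with a per-iteration footprint (schematic):
//   iteration i writes a[i], c[i] and reads b[i], a[i-1].
void f_sequential(int* a, const int* b, int* c, int n) {
    for (int i = 1; i < n; ++i) {
        a[i] = b[i] + 1;        // produced in iteration i
        c[i] = a[i - 1] * 2;    // consumes the value produced in iteration i-1
    }
}

// Parallelised version: one thread per former iteration, launched as a single
// block so that __syncthreads() acts as the barrier the specifications demand.
__global__ void f_par(int* a, const int* b, int* c, int n) {
    int i = threadIdx.x;
    if (i >= 1 && i < n) a[i] = b[i] + 1;
    __syncthreads();                          // synchronisation revealed by the specs
    if (i >= 1 && i < n) c[i] = a[i - 1] * 2;
}

int main() {
    const int n = 256;
    int *a, *b, *c;
    cudaMallocManaged(&a, n * sizeof(int));
    cudaMallocManaged(&b, n * sizeof(int));
    cudaMallocManaged(&c, n * sizeof(int));
    for (int i = 0; i < n; ++i) { a[i] = 0; b[i] = i; c[i] = 0; }

    f_par<<<1, n>>>(a, b, c, n);
    cudaDeviceSynchronize();

    printf("c[2] = %d (expected %d)\n", c[2], (b[1] + 1) * 2);
    cudaFree(a); cudaFree(b); cudaFree(c);
    return 0;
}
```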

    The GPUVerify Method: a Tutorial Overview

    I present a tutorial overview demonstrating the key technique used by GPUVerify, a static verification tool for graphics processing unit (GPU) kernels. The technique is a method for translating a massively parallel GPU kernel into a sequential program such that correctness of the sequential program implies data race freedom of the parallel kernel.
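
    A hedged sketch of the flavour of that translation, written as ordinary host C++ (compilable with nvcc; the function names and the concrete pair-sampling loop are illustrative assumptions, since the real method reasons about two symbolic thread ids): the kernel a[tid] = tid is modelled for two arbitrary threads s and t, the first thread's writes are logged, the second thread's writes are checked against that log, and absence of assertion failures in the sequential program implies race freedom of the parallel kernel.

```cuda
// Sketch of sequentialising a parallel kernel for two arbitrary threads while
// logging and checking memory accesses. The kernel being modelled is: a[tid] = tid;
#include <cassert>
#include <cstdio>
#include <set>

// Record an index written by the first tracked thread.
void log_write(std::set<int>& write_log, int index) { write_log.insert(index); }

// The second tracked thread must not write an index the first one wrote.
void check_no_conflict(const std::set<int>& first_writes, int index) {
    assert(first_writes.count(index) == 0 && "possible data race");
}

// Sequential program standing in for the parallel kernel, instantiated for two
// distinct thread ids s and t. If its assertions can never fail, the kernel is
// free of data races between any pair of threads.
void sequentialised_kernel(int s, int t) {
    assert(s != t);
    std::set<int> writes_of_s;
    log_write(writes_of_s, s);          // thread s executes: a[s] = s;
    check_no_conflict(writes_of_s, t);  // thread t executes: a[t] = t;
}

int main() {
    // The real method treats s and t symbolically; here we merely sample pairs.
    for (int s = 0; s < 8; ++s)
        for (int t = 0; t < 8; ++t)
            if (s != t) sequentialised_kernel(s, t);
    printf("no conflicts detected for the sampled thread pairs\n");
    return 0;
}
```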