43 research outputs found

Model Checking Race-freedom When "Sequential Consistency for Data-race-free Programs" is Guaranteed

Author: Hovland Paul D.
Hückelheim Jan
Luo Ziqing
Siegel Stephen F.
Wu Wenhao
Publication date: 29/05/2023

Many parallel programming models guarantee that if all sequentially consistent (SC) executions of a program are free of data races, then all executions of the program will appear to be sequentially consistent. This greatly simplifies reasoning about the program, but leaves open the question of how to verify that all SC executions are race-free. In this paper, we show that with a few simple modifications, model checking can be an effective tool for verifying race-freedom. We explore this technique on a suite of C programs parallelized with OpenMP

arXiv.org e-Print Archive
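
The abstract above turns on checking that every sequentially consistent execution is race-free. A minimal sketch of that idea, not the paper's tool or algorithm: it enumerates all SC interleavings of a tiny program and reports a race when two threads are both enabled on conflicting accesses in the same state. The operation encoding and the names `conflict` and `find_race` are assumptions made for illustration.

```python
from typing import List, Optional, Tuple

# Hypothetical encoding: (kind, name), kind in {"read", "write", "lock", "unlock"}
Op = Tuple[str, str]

def conflict(a: Op, b: Op) -> bool:
    """Two data accesses conflict if they touch the same variable and one writes."""
    data = {"read", "write"}
    return (a[0] in data and b[0] in data
            and a[1] == b[1] and "write" in (a[0], b[0]))

def find_race(threads: List[List[Op]]) -> Optional[Tuple[Op, Op]]:
    """Explore every SC interleaving; return a racing pair of ops, or None."""
    seen = set()
    stack = [(tuple(0 for _ in threads), frozenset())]  # (program counters, held locks)
    while stack:
        pcs, held = stack.pop()
        if (pcs, held) in seen:
            continue
        seen.add((pcs, held))
        # Collect the next operation of each thread that can run in this state.
        enabled = []
        for t, pc in enumerate(pcs):
            if pc == len(threads[t]):
                continue  # thread finished
            op = threads[t][pc]
            if op[0] == "lock" and op[1] in held:
                continue  # blocked on a held lock
            enabled.append((t, op))
        # A race: two different threads are both ready to run conflicting accesses.
        for i in range(len(enabled)):
            for j in range(i + 1, len(enabled)):
                if conflict(enabled[i][1], enabled[j][1]):
                    return enabled[i][1], enabled[j][1]
        # Otherwise take one step in each enabled thread.
        for t, op in enabled:
            new_held = held
            if op[0] == "lock":
                new_held = held | {op[1]}
            elif op[0] == "unlock":
                new_held = held - {op[1]}
            stack.append((pcs[:t] + (pcs[t] + 1,) + pcs[t + 1:], new_held))
    return None

racy = [[("write", "x")], [("read", "x")]]
guarded = [
    [("lock", "m"), ("write", "x"), ("unlock", "m")],
    [("lock", "m"), ("read", "x"), ("unlock", "m")],
]
```

Here `find_race(racy)` returns the conflicting pair, while `find_race(guarded)` returns `None` because the lock keeps the two accesses from ever being enabled together. The sketch assumes well-formed locking (no recursive acquisition, unlock only by the holder) and explores the full state space, so it is only practical for very small programs.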

Pervasive Parallel And Distributed Computing In A Liberal Arts College Curriculum

Author: Danner Andrew
Newhall Tia
Webb Kevin
Publication venue: 'Transformative Works and Cultures'
Publication date: 01/07/2017

We present a model for incorporating parallel and distributed computing (PDC) throughout an undergraduate CS curriculum. Our curriculum is designed to introduce students early to parallel and distributed computing topics and to expose students to these topics repeatedly in the context of a wide variety of CS courses. The key to our approach is the development of a required intermediate-level course that serves as an introduction to computer systems and parallel computing. It serves as a requirement for every CS major and minor and is a prerequisite to upper-level courses that expand on parallel and distributed computing topics in different contexts. With the addition of this new course, we are able to easily make room in upper-level courses to add and expand parallel and distributed computing topics. The goal of our curricular design is to ensure that every graduating CS major has exposure to parallel and distributed computing, with both a breadth and depth of coverage. Our curriculum is particularly designed for the constraints of a small liberal arts college; however, many of its ideas and much of its design are applicable to any undergraduate CS curriculum

Works

Bridging Control-Centric and Data-Centric Optimization

Author: Ates Berke
Ben-Nun Tal
Calotoiu Alexandru
Hoefler Torsten
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/02/2023

With the rise of specialized hardware and new programming languages, code optimization has shifted its focus towards promoting data locality. Most production-grade compilers adopt a control-centric mindset - instruction-driven optimization augmented with scalar-based dataflow - whereas other approaches provide domain-specific and general purpose data movement minimization, which can miss important control-flow optimizations. As the two representations are not commutable, users must choose one over the other. In this paper, we explore how both control- and data-centric approaches can work in tandem via the Multi-Level Intermediate Representation (MLIR) framework. Through a combination of an MLIR dialect and specialized passes, we recover parametric, symbolic dataflow that can be optimized within the DaCe framework. We combine the two views into a single pipeline, called DCIR, showing that it is strictly more powerful than either view. On several benchmarks and a real-world application in C, we show that our proposed pipeline consistently outperforms MLIR and automatically uncovers new optimization opportunities with no additional effort. Comment: CGO'2

arXiv.org e-Print Archive

Repository for Publications and Research Data

ORCA: Ordering-free Regions for Consistency and Atomicity

Author: DeLozier Christian
Devietti Joe
Eizenberg Ariel
Lucia Brandon
Peng Yuanfeng
Publication venue: ScholarlyCommons
Publication date: 28/04/2016

Writing correct synchronization is one of the main difficulties of multithreaded programming. Incorrect synchronization causes many subtle concurrency errors such as data races and atomicity violations. Previous work has proposed stronger memory consistency models to rule out certain classes of concurrency bugs. However, these approaches are limited by a program’s original (and possibly incorrect) synchronization. In this work, we provide stronger guarantees than previous memory consistency models by punctuating atomicity only at ordering constructs like barriers, but not at lock operations. We describe the Ordering-free Regions for Consistency and Atomicity (ORCA) system which enforces atomicity at the granularity of ordering-free regions (OFRs). While many atomicity violations occur at finer granularity, in an empirical study of many large multithreaded workloads we find no examples of code that requires atomicity coarser than OFRs. Thus, we believe OFRs are a conservative approximation of the atomicity requirements of many programs. ORCA assists programmers by throwing an exception when OFR atomicity is threatened, and, in exception-free executions, guaranteeing that all OFRs execute atomically. In our evaluation, we show that ORCA automatically prevents real concurrency bugs. A user-study of ORCA demonstrates that synchronizing a program with ORCA is easier than using a data race detector. We evaluate modest hardware support that allows ORCA to run with just 18% slowdown on average over pthreads, with very similar scalability

ScholarlyCommons@Penn

A Sparse Learning Approach for Linux Kernel Data Race Prediction

Author: Ryan Gabriel
Publication date: 01/01/2023

Operating system kernels rely on fine-grained concurrency to achieve optimal performance on modern multi-core processors. However, heavy usage of fine-grained concurrency mechanisms makes modern operating system kernels prone to data races, which can cause severe and often elusive bugs. In this thesis, I propose a new approach to identifying data races in OS kernels based on learning a model to predict which memory accesses can be feasibly executed concurrently with one another. To develop an efficient learning method for memory access feasibility, I develop a novel approach based on encoding feasibility as a Boolean indicator function of system calls and ordered memory accesses. A memory access feasibility function encoded this way will have a naturally sparse latent representation due to the sparsity of interthread communications and synchronization interactions, and can therefore be accurately approximated based on a small number of observed concurrent execution traces. This thesis introduces two key contributions. First, Probabilistic Lockset Analysis (PLA) is a new analysis that exploits sparsity in input dependencies in conjunction with a conservative lockset analysis to efficiently predict data races in the Linux OS kernel. Second, approximate happens-before analysis in the Fourier domain (HBFourier) generalizes the approach used by PLA to reason about interthread memory communications and synchronization events through sparse Fourier learning. In addition to being theoretically grounded, these techniques are highly practical: they find hundreds of races in a recent Linux development kernel, an order of magnitude improvement over prior work, and find races with severe security impacts that have been overlooked by existing kernel testing systems for years

Columbia University Academic Commons
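
The abstract above builds on a conservative lockset analysis. As a rough illustration of that underlying idea only, here is a minimal sketch of a classic Eraser-style lockset check over a recorded trace. It is not the thesis's Probabilistic Lockset Analysis, and the trace format and the name `lockset_races` are assumptions made for this example.

```python
from typing import Dict, Iterable, List, Set, Tuple

# Hypothetical trace event: (thread id, kind, name),
# kind in {"lock", "unlock", "read", "write"}
Event = Tuple[int, str, str]

def lockset_races(trace: Iterable[Event]) -> List[str]:
    """Return variables whose candidate lockset becomes empty while shared."""
    held: Dict[int, Set[str]] = {}        # locks currently held by each thread
    candidates: Dict[str, Set[str]] = {}  # locks held on every access so far
    accessors: Dict[str, Set[int]] = {}   # threads that have touched each variable
    reported: List[str] = []
    for tid, kind, name in trace:
        locks = held.setdefault(tid, set())
        if kind == "lock":
            locks.add(name)
        elif kind == "unlock":
            locks.discard(name)
        elif kind in ("read", "write"):
            if name not in candidates:
                candidates[name] = set(locks)  # first access: start from held locks
            else:
                candidates[name] &= locks      # keep only locks held every time
            accessors.setdefault(name, set()).add(tid)
            unprotected = not candidates[name]
            shared = len(accessors[name]) > 1
            if unprotected and shared and name not in reported:
                reported.append(name)
    return reported

trace = [
    (1, "lock", "m"), (1, "write", "x"), (1, "write", "y"), (1, "unlock", "m"),
    (2, "lock", "m"), (2, "read", "x"), (2, "unlock", "m"),
    (2, "write", "y"),  # y is accessed here without holding m
]
```

On this trace, `lockset_races(trace)` flags `y` but not `x`, since `m` is held on every access to `x`. The check is conservative: it ignores the read/write distinction and happens-before ordering, so it can report variables that are never actually raced on.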

The Silently Shifting Semicolon

Author: Marino Daniel
Millstein Todd
Musuvathi Madanlal
Narayanasamy Satish
Singh Abhayendra
Publication venue: LIPIcs - Leibniz International Proceedings in Informatics. 1st Summit on Advances in Programming Languages (SNAPL 2015)
Publication date: 01/01/2015

Memory consistency models for modern concurrent languages have largely been designed from a system-centric point of view that protects, at all costs, optimizations that were originally designed for sequential programs. The result is a situation that, when viewed from a programmer's standpoint, borders on absurd. We illustrate this unfortunate situation with a brief fable and then examine the opportunities to right our path

Dagstuhl Research Online Publication Server

Exposing concurrency failures: a comprehensive survey of the state of the art and a novel approach to reproduce field failures

Author: Bianchi Francesco Adalberto
Pezzè Mauro
Publication date: 20/12/2018

With the rapid advance of multi-core and distributed architectures, concurrent systems are becoming more and more popular. Concurrent systems are extremely hard to develop and validate, as their overall behavior depends on the non-deterministic interleaving of the execution flows that comprise the system. Wrong and unexpected interleavings may lead to concurrency faults that are extremely hard to avoid, detect, and fix due to their non-deterministic nature. This thesis addresses the problem of exposing concurrency failures. Exposing concurrency failures is a crucial activity to locate and fix the related fault and amounts to determining both a test case and an interleaving that trigger the failure. Given the high cost of manually identifying a failure-inducing test case and interleaving among the infinite number of inputs and interleavings of the system, the problem of automatically exposing concurrency failures has been studied by researchers since the late seventies and is still a hot research topic. This thesis advances the research in exposing concurrency failures by proposing two main contributions. The first contribution is a comprehensive survey and taxonomy of the state-of-the-art techniques for exposing concurrency failures. The taxonomy and survey provide a framework that captures the key features of the existing techniques, identify a set of classification criteria to review and compare them, and highlight their strengths and weaknesses, leading to a thorough assessment of the field and paving the road for future progress. The second contribution of this thesis is a technique to automatically expose and reproduce concurrency field failures. One of the main findings of our survey is that automatically reproducing concurrency field failures is still an open problem, as the few techniques that have been proposed rely on information that may be hard to collect, and identify failure-inducing interleavings but do not synthesize failure-inducing test cases.
We propose a technique that advances over state-of-the-art approaches by relying on information that is easily obtainable and by automatically identifying both a failure-inducing test case and interleaving. We empirically demonstrate the effectiveness of our approach on a benchmark of real concurrency failures taken from different popular code bases

RERO DOC Digital Library

Dynamic race detection for C++11

Author: Donaldson AF
Lidbury C
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 03/10/2016

The intricate rules for memory ordering and synchronisation associated with the C/C++11 memory model mean that data races can be difficult to eliminate from concurrent programs. Dynamic data race analysis can pinpoint races in large and complex applications, but the state-of-the-art ThreadSanitizer (tsan) tool for C/C++ considers only sequentially consistent program executions, and does not correctly model synchronisation between C/C++11 atomic operations. We present a scalable dynamic data race analysis for C/C++11 that correctly captures C/C++11 synchronisation, and uses instrumentation to support exploration of a class of non sequentially consistent executions. We concisely define the memory model fragment captured by our instrumentation via a restricted axiomatic semantics, and show that the axiomatic semantics permits exactly those executions explored by our instrumentation. We have implemented our analysis in tsan, and evaluate its effectiveness on benchmark programs, enabling a comparison with the CDSChecker tool, and on two large and highly concurrent applications: the Firefox and Chromium web browsers. Our results show that our method can detect races that are beyond the scope of the original tsan tool, and that the overhead associated with applying our enhanced instrumentation to large applications is tolerable

Spiral - Imperial College Digital Repository