Search CORE

14 research outputs found

SmartTrack: Efficient Predictive Race Detection

Author: Biswas Swarnendu
Biswas Swarnendu
Blackshear Sam
Boehm J.
Boehm J.
Bond Michael D.
Cao Man
Flanagan Cormac
Flanagan Cormac
Flanagan Cormac
Flanagan Cormac
Genç Kaan
Gorogiannis Nikos
Huang Jeff
Huang Jeff
Huang Shiyou
Kasikci Baris
Liu Peng
Luo Peng
Manson Jeremy
Mattern Friedemann
Pozniansky Eli
Roemer Jake
Roemer Jake
Roemer Jake
Segulja Cedomir
von Praun Christoph
Wood Benjamin P.
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 08/04/2020
Field of study

Widely used data race detectors, including the state-of-the-art FastTrack algorithm, incur performance costs that are acceptable for regular in-house testing, but miss races detectable from the analyzed execution. Predictive analyses detect more data races in an analyzed execution than FastTrack detects, but at significantly higher performance cost. This paper presents SmartTrack, an algorithm that optimizes predictive race detection analyses, including two analyses from prior work and a new analysis introduced in this paper. SmartTrack's algorithm incorporates two main optimizations: (1) epoch and ownership optimizations from prior work, applied to predictive analysis for the first time; and (2) novel conflicting critical section optimizations introduced by this paper. Our evaluation shows that SmartTrack achieves performance competitive with FastTrack-a qualitative improvement in the state of the art for data race detection.Comment: Extended arXiv version of PLDI 2020 paper (adds Appendices A-E) #228 SmartTrack: Efficient Predictive Race Detectio

arXiv.org e-Print Archive

Crossref

AI: a lightweight system for tolerating concurrency bugs

Author
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2014
Field of study

Crossref

Pacman: Tolerating asymmetric data races with unintrusive hardware

Author
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date
Field of study

Crossref

Uniparallel Execution and its Uses.

Author: Veeraraghavan Kaushik
Publication venue
Publication date: 01/01/2011
Field of study

We introduce uniparallelism: a new style of execution that allows multithreaded applications to benefit from the simplicity of uniprocessor execution while scaling performance with increasing processors. A uniparallel execution consists of a thread-parallel execution, where each thread runs on its own processor, and an epoch-parallel execution, where multiple time intervals (epochs) of the program run concurrently. The epoch-parallel execution runs all threads of a given epoch on a single processor; this enables the use of techniques that are effective on a uniprocessor. To scale performance with increasing cores, a thread-parallel execution runs ahead of the epoch-parallel execution and generates speculative checkpoints from which to start future epochs. If these checkpoints match the program state produced by the epoch-parallel execution at the end of each epoch, the speculation is committed and output externalized; if they mismatch, recovery can be safely initiated as no speculative state has been externalized. We use uniparallelism to build two novel systems: DoublePlay and Frost. DoublePlay benefits from the efficiency of logging the epoch-parallel execution (as threads in an epoch are constrained to a single processor, only infrequent thread context-switches need to be logged to recreate the order of shared-memory accesses), allowing it to outperform all prior systems that guarantee deterministic replay on commodity multiprocessors. While traditional methods detect data races by analyzing the events executed by a program, Frost introduces a new, substantially faster method called outcome-based race detection to detect the effects of a data race by comparing the program state of replicas for divergences. Unlike DoublePlay, which runs a single epoch-parallel execution of the program, Frost runs multiple epoch-parallel replicas with complementary schedules, which are a set of thread schedules crafted to ensure that replicas diverge only if a data race occurs and to make it very likely that harmful data races cause divergences. Frost detects divergences by comparing the outputs and memory states of replicas at the end of each epoch. Upon detecting a divergence, Frost analyzes the replica outcomes to diagnose the data race bug and selects an appropriate recovery strategy that masks the failure.Ph.D.Computer Science & EngineeringUniversity of Michigan, Horace H. Rackham School of Graduate Studieshttp://deepblue.lib.umich.edu/bitstream/2027.42/89677/1/kaushikv_1.pd

CiteSeerX

Deep Blue Documents at the University of Michigan

Holistic System Design for Deterministic Replay.

Author: Lee Dongyoon
Publication venue
Publication date: 01/01/2013
Field of study

Deterministic replay systems record and reproduce the execution of a hardware or software system. While it is well known how to replay uniprocessor systems, it is much harder to provide deterministic replay of shared memory multithreaded programs on multiprocessors because shared memory accesses add a high-frequency source of non-determinism. This thesis proposes efficient multiprocessor replay systems: Respec, Chimera, and Rosa. Respec is an operating-system-based replay system. Respec is based on the observation that most program executions are data-race-free and for programs with no data races it is sufficient to record program input and the happens-before order of synchronization operations for replay. Respec speculates that a program is data-race-free and supports rollback and recovery from misspeculation. For racy programs, Respec employs a cheap runtime check that compares system call outputs and memory/register states of recorded and replayed processes at a semi-regular interval. Chimera uses a sound static data race detector to find all potential data races and instrument pairs of potentially racing instructions to transform an arbitrary program to make it data-race-free. Then, Chimera records only the non-deterministic inputs and the order of synchronization operations for replay. However, existing static data race detectors generate excessive false warnings, leading to high recording overhead. Chimera resolves this problem by employing a combination of profiling, symbolic analysis, and dynamic checks that target the sources of imprecision in the static data race detector. Rosa is a processor-based ultra-low overhead (less than one percent) replay solution that requires very little hardware support as it essentially only needs a log of cache misses to reproduce a multiprocessor execution. Unlike previous hardware-assisted systems, Rosa does not record shared memory dependencies at all. Instead, it infers them offline using a Satisfiability Modulo Theories (SMT) solver. Our offline analysis is capable of inferring interleavings that are legal under the Sequentially Consistency (SC) and Total Store Order (TSO) memory models.PhDComputer Science & EngineeringUniversity of Michigan, Horace H. Rackham School of Graduate Studieshttp://deepblue.lib.umich.edu/bitstream/2027.42/102374/1/dongyoon_1.pd

Deep Blue Documents at the University of Michigan

Dynamic Analysis of Embedded Software

Author
Publication venue
Publication date: 01/01/2015
Field of study

abstract: Most embedded applications are constructed with multiple threads to handle concurrent events. For optimization and debugging of the programs, dynamic program analysis is widely used to collect execution information while the program is running. Unfortunately, the non-deterministic behavior of multithreaded embedded software makes the dynamic analysis difficult. In addition, instrumentation overhead for gathering execution information may change the execution of a program, and lead to distorted analysis results, i.e., probe effect. This thesis presents a framework that tackles the non-determinism and probe effect incurred in dynamic analysis of embedded software. The thesis largely consists of three parts. First of all, we discusses a deterministic replay framework to provide reproducible execution. Once a program execution is recorded, software instrumentation can be safely applied during replay without probe effect. Second, a discussion of probe effect is presented and a simulation-based analysis is proposed to detect execution changes of a program caused by instrumentation overhead. The simulation-based analysis examines if the recording instrumentation changes the original program execution. Lastly, the thesis discusses data race detection algorithms that help to remove data races for correctness of the replay and the simulation-based analysis. The focus is to make the detection efficient for C/C++ programs, and to increase scalability of the detection on multi-core machines.Dissertation/ThesisDoctoral Dissertation Computer Science 201

ASU Digital Repository

Detecting and Surviving Data Races using Complementary Schedules

Author: Jason Flinn
Kaushik Veeraraghavan
Peter M. Chen
Satish Narayanasamy
Publication venue
Publication date: 01/01/2011
Field of study

Data races are a common source of errors in multithreaded programs. In this paper, we show how to protect a program from data race errors at runtime by executing multiple replicas of the program with complementary thread schedules. Complementary schedules are a set of replica thread schedules crafted to ensure that replicas diverge only if a data race occurs and to make it very likely that harmful data races cause divergences. Our system, called Frost 1, uses complementary schedules to cause at least one replica to avoid the order of racing instructions that leads to incorrect program execution for most harmful data races. Frost introduces outcome-based race detection, which detects data races by comparing the state of replicas executing complementary schedules. We show that this method is substantiall

CiteSeerX

Crossref

Exposing concurrency failures: a comprehensive survey of the state of the art and a novel approach to reproduce field failures

Author: Bianchi Francesco Adalberto
Pezzè Mauro
Publication venue
Publication date: 20/12/2018
Field of study

With the rapid advance of multi-core and distributed architectures, concurrent systems are becoming more and more popular. Concurrent systems are extremely hard to develop and validate, as their overall behavior depends on the non-deterministic interleaving of the execution flows that comprise the system. Wrong and unexpected interleavings may lead to concurrency faults that are extremely hard to avoid, detect, and fix due to their non-deterministic nature. This thesis addresses the problem of exposing concurrency failures. Exposing concurrency failures is a crucial activity to locate and fix the related fault and amounts to determine both a test case and an interleaving that trigger the failure. Given the high cost of manually identifying a failure-inducing test case and interleaving among the infinite number of inputs and interleavings of the system, the problem of automatically exposing concurrency failures has been studied by researchers since the late seventies and is still a hot research topic. This thesis advances the research in exposing concurrency failures by proposing two main contributions. The first contribution is a comprehensive survey and taxonomy of the state-of-the-art techniques for exposing concurrency failures. The taxonomy and survey provide a framework that captures the key features of the existing techniques, identify a set of classification criteria to review and compare them, and highlight their strengths and weaknesses, leading to a thorough assessment of the field and paving the road for future progresses. The second contribution of this thesis is a technique to automatically expose and reproduce concurrency field failure. One of the main findings of our survey is that automatically reproducing concurrency field failures is still an open problem, as the few techniques that have been proposed rely on information that may be hard to collect, and identify failure-inducing interleavings but do not synthesize failure-inducing test cases. We propose a technique that advances over state- of-the-art approaches by relying on information that is easily obtainable and by automatically identifying both a failure- inducing test case and interleaving. We empirically demonstrate the effectiveness of our approach on a benchmark of real concurrency failures taken from different popular code bases

RERO DOC Digital Library

Recommended from our members

Sound and Precise Analysis of Multithreaded Programs through Schedule Specialization and Execution Filters

Author: Wu Jingyue
Publication venue: 'Columbia University Libraries/Information Services'
Publication date: 01/01/2014
Field of study

Multithreaded programs are known to be difficult to analyze. A key reason is that they typically have an enormous number of execution interleavings, or schedules. Static analysis with respect to all schedules requires over-approximation, resulting in poor precision; dynamic analysis rarely covers more than a tiny fraction of all schedules, so its result may not hold for schedules not covered. To address this challenge, we propose a novel approach called schedule specialization that restricts the schedules of a program to make it easier to analyze. Schedule specialization combines static and dynamic analysis. It first statically analyzes a multithreaded program with respect to a small set of schedules for precision, and then enforces these schedules at runtime for soundness of the static analysis results. To demonstrate that this approach works, we build three systems. The first system is a specialization framework that specializes a program into a simpler program based on a schedule for precision. It allows stock analyses to automatically gain precision with only little modification. The second system is Peregrine, a deterministic multithreading system that collects and enforces schedules on future inputs. Peregrine reuses a small set of schedules on many inputs, ensuring our static analysis results to be sound for a wide range of inputs. It also enforces these schedules efficiently, making schedule specialization suitable for production usage. Although schedule specialization can make static concurrency error detection more precise, some concurrency errors such as races may still slip detection and enter production systems. To mitigate this limitation, we build Loom, a live-workaround system that protects a live multithreaded program from races that slip detection. It allows developers to easily write execution filters to safely and efficiently work around deployed races in live multithreaded programs without restarting them

Columbia University Academic Commons