
19 research outputs found

A Safety-First Approach to Memory Models.

Author: Singh Abhayendra Narayan
Publication date: 01/01/2016

Sequential consistency (SC) is arguably the most intuitive behavior for a shared-memory multithreaded program. It is widely accepted that language-level SC could significantly improve programmability of a multiprocessor system. However, efficiently supporting end-to-end SC remains a challenge as it requires that both compiler and hardware optimizations preserve SC semantics. Current concurrent languages support a relaxed memory model that requires programmers to explicitly annotate all memory accesses that can participate in a data-race ("unsafe" accesses). This requirement allows compiler and hardware to aggressively optimize unannotated accesses, which are assumed to be data-race-free ("safe" accesses), while still preserving SC semantics. However, unannotated data races are easy for programmers to accidentally introduce and are difficult to detect, and in such cases the safety and correctness of programs are significantly compromised. This dissertation argues instead for a safety-first approach, whereby every memory operation is treated as potentially unsafe by the compiler and hardware unless it is proven otherwise. The first solution, the DRFx memory model, allows many common compiler and hardware optimizations (potentially SC-violating) on unsafe accesses and uses runtime support to detect potential SC violations arising from reordering of unsafe accesses. On detecting a potential SC violation, execution is halted before the safety property is compromised. The second solution takes a different approach and preserves SC in both compiler and hardware. Both the SC-preserving compiler and hardware are also built on the safety-first approach. All memory accesses are treated as potentially unsafe by the compiler and hardware. SC-preserving hardware relies on different static and dynamic techniques to identify safe accesses. Our results indicate that supporting SC at the language level is not expensive in terms of performance and hardware complexity.
The dissertation also explores an extension of this safety-first approach for data-parallel accelerators such as Graphics Processing Units (GPUs). Significant microarchitectural differences between CPU and GPU require rethinking of efficient solutions for preserving SC in GPUs. The proposed solution based on our SC-preserving approach performs nearly on par with the baseline GPU that implements a data-race-free-0 memory model.
PhD, Computer Science and Engineering, University of Michigan, Horace H. Rackham School of Graduate Studies
http://deepblue.lib.umich.edu/bitstream/2027.42/120794/1/ansingh_1.pd
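
To make the SC guarantee discussed in the abstract above concrete, the following is a minimal sketch, not taken from the dissertation, that enumerates every sequentially consistent interleaving of the classic store-buffering litmus test. The helper names `interleavings` and `run` and the variable names are assumptions made for illustration only.

```python
def interleavings(a, b):
    """Yield every merge of a and b that preserves each thread's program order."""
    if not a or not b:
        yield list(a) + list(b)
        return
    for rest in interleavings(a[1:], b):
        yield [a[0]] + rest
    for rest in interleavings(a, b[1:]):
        yield [b[0]] + rest


def run(schedule):
    """Execute one interleaving against a single shared memory (the SC view)."""
    memory = {"x": 0, "y": 0}
    regs = {}
    for op, dst, src in schedule:
        if op == "store":   # dst is a shared variable, src a constant
            memory[dst] = src
        else:               # "load": dst is a register, src a shared variable
            regs[dst] = memory[src]
    return regs["r1"], regs["r2"]


# Thread 1: x = 1; r1 = y        Thread 2: y = 1; r2 = x
thread1 = [("store", "x", 1), ("load", "r1", "y")]
thread2 = [("store", "y", 1), ("load", "r2", "x")]

sc_outcomes = {run(s) for s in interleavings(thread1, thread2)}
print(sorted(sc_outcomes))
```

Under this model the outcome `(0, 0)` never appears. A compiler or processor that reorders either thread's store after its load can produce it, which is the kind of SC violation the abstract refers to.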

Deep Blue Documents at the University of Michigan

End-to-End Sequential Consistency

Author: Abhayendra Singh
Daniel Marino
Madanlal Musuvathi
Satish Narayanasamy
Todd Millstein
Publication date: 01/01/2012

CiteSeerX

Crossref

Asymmetric Memory Fences

Author: Douglas
Lee Jaejin
Minh Chi Cao
Reinders James
Singh Abhayendra
Publication venue: 'Association for Computing Machinery (ACM)'

Crossref

A Case for an SC-Preserving Compiler

Author: Abhayendra Singh
Daniel Marino
Madanlal Musuvathi
Satish Narayanasamy
Todd Millstein
Publication date: 01/01/2011

The most intuitive memory consistency model for shared-memory multi-threaded programming is sequential consistency (SC). However, current concurrent programming languages support a relaxed model, as such relaxations are deemed necessary for enabling important optimizations. This paper demonstrates that an SC-preserving compiler, one that ensures that every SC behavior of a compiler-generated binary is an SC behavior of the source program, retains most of the performance benefits of an optimizing compiler. The key observation is that a large class of optimizations crucial for performance are either already SC-preserving or can be modified to preserve SC while retaining much of their effectiveness. An SC-preserving compiler, obtained by restricting the optimization phases in LLVM, a state-of-the-art C/C++ compiler, incurs an average slowdown of 3.8% and a maximum slowdown of 34% on a set of 30 programs from the SPLASH-2, PARSEC, and SPEC CINT2006 benchmark suites. While the performance overhead of preserving SC in the compiler is much less than previously assumed, it might still be unacceptable for certain applications. We believe there are several avenues for improving performance without giving up SC-preservation. In this vein, we observe that the overhead of our SC-preserving compiler arises mainly from its inability to aggressively perform a class of optimizations we identify as eager-load optimizations. This class includes common-subexpression elimination, constant propagation, global value numbering, and common cases of loop-invariant code motion. We propose a notion of interference checks in order to enable eager-load optimizations while preserving SC. Interference checks expose to the compiler a commonly used hardware speculation mechanism that can efficiently detect whether a particular variable has changed its value since last read.
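
The abstract's point about eager-load optimizations can be illustrated with a small model. The sketch below is an assumption-laden illustration rather than the paper's own example: it compares the SC outcomes of a reader thread before and after a hypothetical common-subexpression elimination that reuses an earlier load of `x` instead of loading it again. The op encoding and helper names are invented for this example.

```python
def interleavings(a, b):
    """Yield every merge of a and b that preserves each thread's program order."""
    if not a or not b:
        yield list(a) + list(b)
        return
    for rest in interleavings(a[1:], b):
        yield [a[0]] + rest
    for rest in interleavings(a, b[1:]):
        yield [b[0]] + rest


def outcomes(reader, writer):
    """Collect the reader's final (a, b, c) over all SC interleavings."""
    results = set()
    for schedule in interleavings(reader, writer):
        memory = {"x": 0, "y": 0}
        regs = {}
        for op, dst, src in schedule:
            if op == "store":      # dst is a shared variable, src a constant
                memory[dst] = src
            elif op == "load":     # dst is a register, src a shared variable
                regs[dst] = memory[src]
            else:                  # "copy": dst and src are both registers
                regs[dst] = regs[src]
        results.add((regs["a"], regs["b"], regs["c"]))
    return results


writer = [("store", "x", 1), ("store", "y", 1)]
source = [("load", "a", "x"), ("load", "b", "y"), ("load", "c", "x")]
# Hypothetical result of eliminating the "redundant" second load of x.
optimized = [("load", "a", "x"), ("load", "b", "y"), ("copy", "c", "a")]

new_behaviors = outcomes(optimized, writer) - outcomes(source, writer)
print(new_behaviors)
```

The optimized reader can observe `(a, b, c) = (0, 1, 0)`, which no SC execution of the source allows: seeing `y == 1` implies the store to `x` already happened, so a later load of `x` must return 1. An interference check, as described in the abstract, would let the compiler keep the reuse only when `x` has not changed since it was first read.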

CiteSeerX

Crossref

Efficient Processor Support for DRFx, a Memory Model with Exceptions

Author: Abhayendra Singh
Daniel Marino
Madanlal Musuvathi
Satish Narayanasamy
Todd Millstein
Publication date: 01/01/2011

A longstanding challenge of shared-memory concurrency is to provide a memory model that allows for efficient implementation while providing strong and simple guarantees to programmers. The C++0x and Java memory models admit a wide variety of compiler and hardware optimizations and provide sequentially consistent (SC) semantics for data-race-free programs. However, they either do not provide any semantics (C++0x) or provide a hard-to-understand semantics (Java) for racy programs, compromising the safety and debuggability of such programs. In earlier work we proposed the DRFx memory model, which addresses this problem by dynamically detecting potential violations of SC due to the interaction of compiler or hardware optimizations with data races and halting execution upon detection. In this paper, we present a detailed micro-architecture design for supporting the DRFx memory model, formalize the design and prove its correctness, and evaluate the design using a hardware simulator. We describe a set of DRFx-compliant complexity-effective optimizations which allow us to attain performance close to that of TSO (Total Store Order) and DRF0 while providing strong guarantees for all programs.

CiteSeerX

Crossref


DRFx: An Understandable, High Performance, and Flexible Memory Model for Concurrent Languages

Author: Marino Daniel
Millstein Todd
Musuvathi Madanlal
Narayanasamy Satish
Singh Abhayendra
Publication venue: eScholarship, University of California
Publication date: 01/01/2016

The most intuitive memory model for shared-memory multi-threaded programming is sequential consistency (SC), but it disallows the use of many compiler and hardware optimizations and thus affects performance. Data-race-free (DRF) models, such as the C++11 memory model, guarantee SC execution for data-race-free programs. But these models provide no guarantee at all for racy programs, compromising the safety and debuggability of such programs. To address the safety issue, the Java memory model, which is also based on the DRF model, provides a weak semantics for racy executions. However, this semantics is subtle and complex, making it difficult for programmers to reason about their programs and for compiler writers to ensure the correctness of compiler optimizations. We present the DRFx memory model, which is simple for programmers to understand and use while still supporting many common optimizations. We introduce a memory model (MM) exception that can be signaled to halt execution. If a program executes without throwing this exception, then DRFx guarantees that the execution is SC. If a program throws an MM exception during an execution, then DRFx guarantees that the program has a data race. We observe that SC violations can be detected in hardware through a lightweight form of conflict detection. Furthermore, our model safely allows aggressive compiler and hardware optimizations within compiler-designated program regions. We formalize our memory model, prove several properties of this model, describe a compiler and hardware design suitable for DRFx, and evaluate the performance overhead due to our compiler and hardware requirements.
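
The region-based conflict detection mentioned in the abstract can be approximated in software. The sketch below is only a toy model under stated assumptions: it treats every listed region as executing concurrently, tracks read and write sets as Python sets of address names, and uses the hypothetical names `Region`, `conflicts`, `check_concurrent`, and `MMException`. It is not the hardware design evaluated in the paper.

```python
from dataclasses import dataclass, field


class MMException(Exception):
    """Hypothetical stand-in for the memory model (MM) exception."""


@dataclass
class Region:
    """A compiler-designated region with the addresses it reads and writes."""
    thread: int
    reads: set = field(default_factory=set)
    writes: set = field(default_factory=set)


def conflicts(r1, r2):
    """Two regions conflict if they share an address and at least one writes it."""
    return bool(r1.writes & (r2.reads | r2.writes) or r2.writes & r1.reads)


def check_concurrent(regions):
    """Raise MMException if any two regions of different threads conflict.

    Assumption: all regions passed in overlap in time.
    """
    for i, a in enumerate(regions):
        for b in regions[i + 1:]:
            if a.thread != b.thread and conflicts(a, b):
                raise MMException(
                    f"regions of threads {a.thread} and {b.thread} conflict"
                )


# Both threads only read the shared address "x": no conflict.
safe = [Region(0, reads={"x"}, writes={"a"}), Region(1, reads={"x"}, writes={"b"})]
check_concurrent(safe)

# One thread writes "x" while the other reads it: a data race across regions.
racy = [Region(0, writes={"x"}), Region(1, reads={"x"})]
try:
    check_concurrent(racy)
except MMException as exc:
    print("MM exception:", exc)
```

In this model, an execution that completes without the exception had no conflicting concurrent regions, which mirrors the abstract's guarantee that exception-free executions are SC, while a raised exception indicates a data race.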

eScholarship - University of California


DRFx: An Understandable, High Performance, and Flexible Memory Model for Concurrent Languages.

Author: Marino Daniel
Millstein Todd D
Musuvathi Madanlal
Narayanasamy Satish
Singh Abhayendra
Publication venue: eScholarship, University of California
Publication date: 01/01/2016

eScholarship - University of California