Adding Token Counting to Directory-Based Cache Coherence

Author: Blundell Colin
Martin Milo M.K.
Raghavan Arun
Publication venue: ScholarlyCommons
Publication date: 04/06/2008

The coherence protocol is a first-order design concern in multicore designs. Directory protocols are naturally scalable, as they place no restrictions on the interconnect and have minimal bandwidth requirements; however, this scalability comes at the cost of increased sharing latency due to indirection. In contrast, broadcast-based systems such as snooping protocols and token coherence reduce the latency of sharing misses by sending requests directly to other processors. Unfortunately, their reliance on totally ordered interconnects and/or broadcast limits their scalability. This work introduces PATCH (Predictive/Adaptive Token Counting Hybrid), a coherence protocol that provides the scalability of directory protocols while opportunistically using available bandwidth to reduce sharing latency. PATCH extends a standard directory protocol to track tokens and use token counting rules for enforcing coherence permissions. Token counting allows PATCH to support direct requests on an unordered interconnect, while a novel mechanism called token tenure uses local processor timeouts and the directory’s per-block point of ordering at the home node to guarantee forward progress without relying on broadcast. PATCH makes three main contributions. First, PATCH uses direct request prioritization to match the performance of broadcast-based protocols without restricting scalability. Second, PATCH introduces token tenure, which provides broadcast-free forward progress for token counting protocols. Finally, PATCH provides greater scalability than directory protocols when using inexact encodings of sharers because only processors holding tokens need to acknowledge requests. Overall, PATCH is a “one-size-fits-all” coherence protocol that dynamically adapts to work well for small systems, large systems, and anywhere in between.
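The token counting rules mentioned in the abstract can be illustrated with a minimal sketch. It assumes the usual token coherence invariants: each block has a fixed number of tokens, a node needs at least one token to read, and a node needs all tokens to write. The class and method names (`TokenBlock`, `transfer`, and so on) are hypothetical, and the sketch leaves out data validity, the owner token, the directory, and token tenure entirely.

```python
from dataclasses import dataclass, field


@dataclass
class TokenBlock:
    """One cache block tracked by token counting (hypothetical toy model)."""
    total_tokens: int
    holders: dict = field(default_factory=dict)  # node name -> tokens held

    def tokens(self, node):
        return self.holders.get(node, 0)

    def can_read(self, node):
        # Assumed rule: reading requires at least one token.
        return self.tokens(node) >= 1

    def can_write(self, node):
        # Assumed rule: writing requires every token, so no other
        # node can hold a readable copy at the same time.
        return self.tokens(node) == self.total_tokens

    def transfer(self, src, dst, count):
        if count <= 0 or count > self.tokens(src):
            raise ValueError("cannot transfer tokens that are not held")
        self.holders[src] = self.tokens(src) - count
        self.holders[dst] = self.tokens(dst) + count
        # Conservation: tokens are never created or destroyed.
        assert sum(self.holders.values()) == self.total_tokens


# Usage: the home node starts with all four tokens and hands out two.
block = TokenBlock(total_tokens=4, holders={"home": 4})
block.transfer("home", "p0", 1)
block.transfer("home", "p1", 1)
```

After these transfers, `p0` and `p1` can both read but neither can write; a writer would first have to collect all four tokens.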

CiteSeerX

ScholarlyCommons@Penn

Improved Sequence-Based Speculation Techniques for Implementing Memory Consistency

Author: Blundell Colin
Martin Milo M.K.
Wenisch Tom
Publication venue: ScholarlyCommons
Publication date: 27/05/2008

This work presents BMW, a new design for speculative implementations of memory consistency models in shared-memory multiprocessors. BMW obtains the same performance as prior proposals, but achieves this performance while avoiding several undesirable attributes of prior proposals: non-scalable structures, per-word valid bits in the data cache, modifications to the cache coherence protocol, and global arbitration. BMW uses a read and write bit per cache block and a standard invalidation-based cache coherence protocol to perform conflict detection while speculating. While speculating, stores to blocks not in the cache are placed into a coalescing store buffer until those misses return. Stores are written speculatively to the primary cache, and non-speculative state is maintained by cleaning dirty blocks before they are written speculatively. Speculative blocks are invalidated on abort and marked as non-speculative on commit. This organization allows for fast, local commits while avoiding a non-scalable store queue.
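The per-block read and write bits described above can be sketched roughly as follows. This is an illustrative toy, not BMW's actual design: the class `SpeculativeCache` and its methods are hypothetical, and the conflict rule used here (a remote write to a speculatively read block, or any remote request to a speculatively written block, forces an abort) is an assumption about how such bits are typically used. The coalescing store buffer and the cleaning of dirty blocks are omitted.

```python
class SpeculativeCache:
    """Toy model of per-block read/write bits (illustrative only)."""

    def __init__(self):
        self.valid = set()        # blocks currently present in the cache
        self.read_bits = set()    # blocks read during the current speculation
        self.write_bits = set()   # blocks written during the current speculation
        self.speculating = False

    def begin(self):
        self.speculating = True

    def load(self, block):
        self.valid.add(block)
        if self.speculating:
            self.read_bits.add(block)

    def store(self, block):
        self.valid.add(block)
        if self.speculating:
            self.write_bits.add(block)

    def remote_request(self, block, is_write):
        """Handle an incoming coherence request; return True if it forced an abort."""
        conflict = self.speculating and (
            block in self.write_bits
            or (is_write and block in self.read_bits)
        )
        if conflict:
            self.abort()
        if is_write:
            # A standard invalidation protocol removes the local copy.
            self.valid.discard(block)
        return conflict

    def abort(self):
        # Speculatively written blocks are invalidated on abort.
        self.valid -= self.write_bits
        self._clear_bits()

    def commit(self):
        # Commit is local: blocks simply become non-speculative.
        self._clear_bits()

    def _clear_bits(self):
        self.read_bits.clear()
        self.write_bits.clear()
        self.speculating = False


# Usage: speculate, read block "A", write block "B".
cache = SpeculativeCache()
cache.begin()
cache.load("A")
cache.store("B")
```

In this model a remote read of "A" is harmless, while a remote write to "A" aborts the speculation and discards the speculative copy of "B".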

ScholarlyCommons@Penn

Generating Litmus Tests for Contrasting Memory Consistency Models - Extended Version

Author: Alur Rajeev
Mador-Haim Sela
Martin Milo M.K.
Publication venue: ScholarlyCommons
Publication date: 01/01/2010

Well-defined memory consistency models are necessary for writing correct parallel software. Developing and understanding formal specifications of hardware memory models is a challenge due to the subtle differences in allowed reorderings and different specification styles. To facilitate exploration of memory model specifications, we have developed a technique for systematically comparing hardware memory models specified using both operational and axiomatic styles. Given two specifications, our approach generates all possible multi-threaded programs up to a specified bound, and for each such program, checks if one of the models can lead to an observable behavior not possible in the other model. When the models differ, the tool finds a minimal “litmus test” program that demonstrates the difference. A number of optimizations reduce the number of programs that need to be examined. Our prototype implementation has successfully compared both axiomatic and operational specifications of six different hardware memory models. We describe two case studies: (1) development of a non-store-atomic variant of an existing memory model, which illustrates the use of the tool while developing a new memory model, and (2) identification of a subtle specification mistake in a recently published axiomatic specification of TSO.
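The enumerate-and-compare idea in this abstract can be sketched in a few lines. The sketch below is not the authors' tool: it uses two hand-written operational models (sequential consistency, and a TSO-like model with per-thread FIFO store buffers and store-to-load forwarding), restricts programs to two threads over variables `x` and `y` with all stores writing 1, and applies none of the paper's optimizations. All function names are hypothetical.

```python
from itertools import product

VARS = ("x", "y")
OPS = [("st", v) for v in VARS] + [("ld", v) for v in VARS]


def outcomes(program, buffered):
    """Return every reachable set of load results for a program.

    buffered=False models SC; buffered=True adds a FIFO store buffer
    per thread (a simplified TSO-like model, assumed for illustration).
    """
    results, seen = set(), set()
    start = (tuple(0 for _ in program),      # per-thread program counters
             tuple(() for _ in program),     # per-thread store buffers
             tuple((v, 0) for v in VARS),    # memory, initially all zero
             ())                             # loads seen: (thread, index, value)
    stack = [start]
    while stack:
        state = stack.pop()
        if state in seen:
            continue
        seen.add(state)
        pcs, bufs, mem, loads = state
        if all(pc == len(t) for pc, t in zip(pcs, program)) and not any(bufs):
            results.add(tuple(sorted(loads)))
            continue
        for t, thread in enumerate(program):
            if bufs[t]:  # drain the oldest buffered store to memory
                (var, val), rest = bufs[t][0], bufs[t][1:]
                new_mem = dict(mem)
                new_mem[var] = val
                stack.append((pcs, bufs[:t] + (rest,) + bufs[t + 1:],
                              tuple(sorted(new_mem.items())), loads))
            if pcs[t] == len(thread):
                continue
            kind, var = thread[pcs[t]]
            new_pcs = pcs[:t] + (pcs[t] + 1,) + pcs[t + 1:]
            if kind == "st" and buffered:
                new_bufs = bufs[:t] + (bufs[t] + ((var, 1),),) + bufs[t + 1:]
                stack.append((new_pcs, new_bufs, mem, loads))
            elif kind == "st":
                new_mem = dict(mem)
                new_mem[var] = 1
                stack.append((new_pcs, bufs, tuple(sorted(new_mem.items())), loads))
            else:
                val = dict(mem)[var]
                for buf_var, buf_val in bufs[t]:  # newest own store wins
                    if buf_var == var:
                        val = buf_val
                stack.append((new_pcs, bufs, mem, loads + ((t, pcs[t], val),)))
    return results


def find_litmus_test(max_len=2):
    """Search two-thread programs, shortest first, for a behavior gap."""
    for k in range(1, max_len + 1):
        for t0 in product(OPS, repeat=k):
            for t1 in product(OPS, repeat=k):
                extra = outcomes((t0, t1), True) - outcomes((t0, t1), False)
                if extra:
                    return (t0, t1), extra
    return None


program, extra = find_litmus_test()
```

Under these assumptions the search stops at the classic store-buffering shape: each thread stores to one variable and then loads the other, and only the buffered model lets both loads return 0.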

ScholarlyCommons@Penn

Formal verification of SSA-based optimizations for LLVM

Author: Aycock J.
Jianzhou Zhao
Milo M.K. Martin
Muchnick S. S.
Santosh Nagarakatte
Steve Zdancewic
Yakobowski B.
Publication venue: 'Association for Computing Machinery (ACM)'

Crossref

Improving Multiple-CMP Systems Using Token Coherence

Author: Bingham Jesse D
Hill Mark D
Hu Alan J
Martin Milo M.K.
Marty Michael R
Wood David A
Publication venue: ScholarlyCommons
Publication date: 12/02/2005

Improvements in semiconductor technology now enable Chip Multiprocessors (CMPs). As many future computer systems will use one or more CMPs and support shared memory, such systems will have caches that must be kept coherent. Coherence is a particular challenge for Multiple-CMP (M-CMP) systems. One approach is to use a hierarchical protocol that explicitly separates the intra-CMP coherence protocol from the inter-CMP protocol, but couples them hierarchically to maintain coherence. However, hierarchical protocols are complex, leading to subtle, difficult-to-verify race conditions. Furthermore, most previous hierarchical protocols use directories at one or both levels, incurring indirections, and thus extra latency, for sharing misses, which are common in commercial workloads. In contrast, this paper exploits the separation of correctness substrate and performance policy in the recently proposed token coherence protocol to develop the first M-CMP coherence protocol that is flat for correctness, but hierarchical for performance. Via model checking studies, we show that flat correctness eases verification. Via simulation with micro-benchmarks, we make new protocol variants more robust under contention. Finally, via simulation with commercial workloads on a commercial operating system, we show that new protocol variants can be 10-50% faster than a hierarchical directory protocol.

ScholarlyCommons@Penn

Multicore acceleration of priority-based schedulers for concurrency bug detection

Author: Bienia C.
Clarke E.
Jalbert N.
Jula H.
Madanlal Musuvathi
Milo M.K. Martin
Musuvathi M.
Santosh Nagarakatte
Sebastian Burckhardt
Xiong W.
Publication venue: 'Association for Computing Machinery (ACM)'

Crossref

Computational sprinting on a hardware/software testbed

Author: Arun Raghavan
Chakraborty K.
Girod B.
Kevin P. Pipe
Laurel Emurian
Lei Shao
Li J.
Loudon G.
Marios Papaefthymiou
Merritt R.
Milo M.K. Martin
Rotem E.
Thomas F. Wenisch
Publication venue: 'Association for Computing Machinery (ACM)'

Crossref

SafetyNet: Improving the availability of shared memory multiprocessors with global checkpoint/recovery

Author: Hill Mark D.
Martin Milo M.K.
Sorin Daniel J.
Wood David A.
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2001

CiteSeerX

Crossref

Minds@University of Wisconsin

ScholarlyCommons@Penn