4,400 research outputs found

    Towards compliant distributed shared memory

    Copyright © 2002 IEEE.
    There exists a wide spectrum of coherency models for use in distributed shared memory (DSM) systems. The choice of model for an application should ideally be based on the application's data access patterns and phase changes. However, in current systems, most, if not all, of the parameters of the coherency model are fixed in the underlying DSM system. This forces the application either to structure its computations to suit the underlying model or to endure an inefficient coherency model. This paper introduces a unique approach to the provision of DSM based on the idea of compliance. Compliance allows an application to specify how the system should most effectively operate through a separation between mechanism, provided by the underlying system, and policy, provided by the application. This is in direct contrast with the traditional view that an application must mold itself to the hard-wired choices that its operating platform has made. The contribution of this work is the definition and implementation of an architecture for compliant distributed coherency management. The efficacy of this architecture is illustrated through a worked example.
    Falkner, K. E.; Detmold, H.; Munro, D. S.; Olds, T.
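    The mechanism/policy split the abstract describes can be pictured as an interface the application implements and the DSM runtime consults on each access. The sketch below is illustrative only; the CoherencyPolicy and DsmRegion names are hypothetical, not taken from the paper.

```cpp
// Minimal sketch of mechanism/policy separation for DSM coherency.
// CoherencyPolicy and DsmRegion are hypothetical names used only to
// illustrate an application-supplied policy driving a system mechanism.
#include <cstddef>
#include <iostream>
#include <vector>

// Policy: supplied by the application.
struct CoherencyPolicy {
    virtual ~CoherencyPolicy() = default;
    // Decide whether a write must be propagated to other replicas now.
    virtual bool propagate_on_write(std::size_t offset) const = 0;
    // Decide whether a read may use the locally cached copy.
    virtual bool read_local_ok(std::size_t offset) const = 0;
};

// Mechanism: provided by the DSM system. It does the replication work
// but defers the "when" decisions to the policy.
class DsmRegion {
public:
    DsmRegion(std::size_t size, const CoherencyPolicy& policy)
        : data_(size, 0), policy_(policy) {}

    void write(std::size_t offset, int value) {
        data_.at(offset) = value;
        if (policy_.propagate_on_write(offset)) {
            std::cout << "push update for offset " << offset << "\n";
        }  // otherwise propagation is deferred (e.g. until a release)
    }

    int read(std::size_t offset) {
        if (!policy_.read_local_ok(offset)) {
            std::cout << "fetch fresh copy for offset " << offset << "\n";
        }
        return data_.at(offset);
    }

private:
    std::vector<int> data_;
    const CoherencyPolicy& policy_;
};

// Example application policy: propagate writes eagerly, trust local reads.
struct EagerWritePolicy : CoherencyPolicy {
    bool propagate_on_write(std::size_t) const override { return true; }
    bool read_local_ok(std::size_t) const override { return true; }
};

int main() {
    EagerWritePolicy policy;
    DsmRegion region(16, policy);
    region.write(3, 42);
    std::cout << region.read(3) << "\n";
}
```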

    Distributed shared memory for virtual environments

    Bibliography: leaves 71-77.
    This work investigated making virtual environments easier to program, by designing a suitable distributed shared memory system. To be usable, the system must keep latency to a minimum, as virtual environments are very sensitive to it. The resulting design is push-based and non-consistent. Another requirement is that the system should be scalable, over large distances and over large numbers of participants. The latter is hard to achieve with current network protocols, and a proposal was made for a multicast addressing system more scalable than the one used in the Internet Protocol. Two sample virtual environments were developed to test the ease of use of the system. This showed that the basic concept is sound, but that more support is needed. The next step should be to extend the language and add compiler support, which will enhance ease of use and allow numerous optimisations. This can be improved further by providing system-supported containers.
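    A push-based, non-consistent design of the kind described can be sketched as a writer that applies updates locally and pushes them to subscribers without waiting for acknowledgements. The Writer and Replica names below are hypothetical, and a real system would push over (multicast) network messages rather than in-process calls.

```cpp
// Minimal sketch of push-based, non-consistent update propagation.
// Hypothetical names; illustrates the idea, not the thesis design.
#include <iostream>
#include <string>
#include <unordered_map>
#include <vector>

struct Replica {
    std::unordered_map<std::string, int> state;  // last value pushed to us
    void on_update(const std::string& key, int value) { state[key] = value; }
};

class Writer {
public:
    void subscribe(Replica* r) { subscribers_.push_back(r); }

    // Writes are applied locally and pushed immediately; the writer never
    // waits for acknowledgements, so readers may briefly see stale values.
    void write(const std::string& key, int value) {
        local_[key] = value;
        for (Replica* r : subscribers_) r->on_update(key, value);
    }

private:
    std::unordered_map<std::string, int> local_;
    std::vector<Replica*> subscribers_;
};

int main() {
    Writer avatar_position;
    Replica viewer;
    avatar_position.subscribe(&viewer);
    avatar_position.write("x", 120);
    std::cout << viewer.state["x"] << "\n";  // 120, delivered by push
}
```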

    Distributed Shared Memory in a Grid Environment


    Translation techniques for distributed-shared memory programming models

    This thesis argues that a modular, source-to-source translation system for distributed-shared memory programming models would be beneficial to the high-performance computing community. It goes on to present a proof-of-concept example in detail, translating between Global Arrays (GA) and Unified Parallel C (UPC). Some useful extensions to UPC are discussed, along with how they are implemented in the proof-of-concept translator.
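    The modular, rule-driven flavour of such a translator can be sketched as a table of rewrite rules applied to source text. The single rule below is a toy, not an actual GA-to-UPC mapping from the thesis.

```cpp
// Minimal sketch of a rule-driven source-to-source rewriter. The rule and
// the GLOBAL_GET construct are hypothetical, for illustration only.
#include <iostream>
#include <regex>
#include <string>
#include <vector>

struct RewriteRule {
    std::regex pattern;       // source-model construct
    std::string replacement;  // target-model construct
};

std::string translate(std::string code, const std::vector<RewriteRule>& rules) {
    for (const auto& rule : rules)
        code = std::regex_replace(code, rule.pattern, rule.replacement);
    return code;
}

int main() {
    // Hypothetical rule: a get on a globally distributed array becomes a
    // direct shared-array access in the target model.
    std::vector<RewriteRule> rules = {
        {std::regex(R"(GLOBAL_GET\((\w+),\s*(\w+)\))"), "$1[$2]"}
    };
    std::cout << translate("x = GLOBAL_GET(A, i);", rules) << "\n";
    // prints: x = A[i];
}
```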

    Impact of the Consistency Model on Checkpointing of Distributed Shared Memory

    In this report, we consider the impact of the consistency model on checkpointing and rollback algorithms for distributed shared memory. In particular, we consider specific implementations of four consistency models for distributed shared memory, namely, linearizability, sequential consistency, causal consistency and eventual consistency, and develop checkpointing and rollback algorithms that can be integrated into the implementations of the consistency models. Our results empirically demonstrate that the mechanisms used to implement stronger consistency models lead to simpler or more efficient checkpointing algorithms.
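    At the level of a single replica, checkpointing and rollback reduce to saving and restoring local state. The sketch below shows only that local piece, with hypothetical names; it omits the coordination with the consistency-model implementation that the report is actually about.

```cpp
// Minimal sketch of per-replica checkpoint and rollback for a DSM object
// store. Hypothetical names; no cross-replica coordination is shown.
#include <iostream>
#include <string>
#include <unordered_map>

class Replica {
public:
    void write(const std::string& key, int value) { state_[key] = value; }
    int read(const std::string& key) { return state_[key]; }

    // Record a local snapshot; a full protocol must also ensure the set of
    // snapshots across replicas forms a consistent global state.
    void checkpoint() { saved_ = state_; }

    // Discard updates made since the last checkpoint.
    void rollback() { state_ = saved_; }

private:
    std::unordered_map<std::string, int> state_;
    std::unordered_map<std::string, int> saved_;
};

int main() {
    Replica r;
    r.write("x", 1);
    r.checkpoint();
    r.write("x", 2);   // update made after the checkpoint
    r.rollback();      // lost on rollback
    std::cout << r.read("x") << "\n";  // prints 1
}
```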

    A Comparison of Two Paradigms for Distributed Shared Memory

    This paper compares two paradigms for Distributed Shared Memory on loosely coupled computing systems: the shared data-object model as used in Orca, a programming language specially designed for loosely coupled computing systems, and the Shared Virtual Memory model. For both paradigms two systems are described, one using only point-to-point messages, the other using broadcasting as well. The two paradigms and their implementations are described briefly. Their performances on four applications are compared: the travelling-salesman problem, alpha-beta search, matrix multiplication and the all-pairs shortest paths problem. The relevant measurements were obtained on a system consisting of 10 MC68020 processors connected by an Ethernet. For comparison purposes, the applications have also been run on a system with physical shared memory. In addition, the paper gives measurements for the first two applications above when Remote Procedure Call is used as the communication mechanism. The measurements show that both paradigms can be used efficiently for programming large-grain parallel applications, with significant speed-ups. The structured shared data-object model achieves the highest speed-ups and is easiest to program and to debug. Keywords: Amoeba; distributed shared memory; distributed programming; Orca.
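    The shared data-object paradigm can be sketched as an object whose operations are the only way to touch shared state, so a runtime can replicate the object and apply operations atomically (for example by broadcasting them), in contrast to the page-granularity sharing of Shared Virtual Memory. The counter below is illustrative C++, not Orca syntax.

```cpp
// Minimal sketch of a shared data-object: the object is the unit of
// sharing, and only its operations access the shared state. Illustrative
// single-process C++; a DSM runtime would replicate the object and
// broadcast or forward the operations.
#include <iostream>
#include <mutex>

class SharedCounter {            // a shared data-object
public:
    void increment() {           // operation: applied atomically
        std::lock_guard<std::mutex> lock(m_);
        ++value_;
    }
    int read() {
        std::lock_guard<std::mutex> lock(m_);
        return value_;
    }
private:
    std::mutex m_;
    int value_ = 0;
};

int main() {
    SharedCounter jobs_done;
    jobs_done.increment();
    std::cout << jobs_done.read() << "\n";  // prints 1
}
```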

    Brief announcement: Distributed shared memory based on computation migration

    Driven by increasingly unbalanced technology scaling and power dissipation limits, microprocessor designers have resorted to increasing the number of cores on a single chip, and pundits expect 1000-core designs to materialize in the next few years [1]. But how will memory architectures scale and how will these next-generation multicores be programmed? One barrier to scaling current memory architectures is the off-chip memory bandwidth wall [1,2]: off-chip bandwidth grows with package pin density, which scales much more slowly than on-die transistor density [3]. To reduce reliance on external memories and keep data on-chip, today’s multicores integrate very large shared last-level caches on chip [4]; interconnects used with such shared caches, however, do not scale beyond relatively few cores, and the power requirements and access latencies of large caches exclude their use in chips on a 1000-core scale. For massive-scale multicores, then, we are left with relatively small per-core caches. Per-core caches on a 1000-core scale, in turn, raise the question of memory coherence. On the one hand, a shared memory abstraction is a practical necessity for general-purpose programming, and most programmers prefer a shared memory model [5]. On the other hand, ensuring coherence among private caches is an expensive proposition: bus-based and snoopy protocols don't scale beyond relatively few cores, and directory sizes needed in cache-coherence protocols must equal a significant portion of the combined size of the per-core caches, as otherwise directory evictions will limit performance [6]. Moreover, directory-based coherence protocols are notoriously difficult to implement and verify [7].
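    The computation-migration idea in the title can be sketched as executing each access at the core that owns the data, instead of moving the data and keeping per-core caches coherent. The Core struct and home_of mapping below are hypothetical; a real design would ship execution contexts between hardware cores, not call functions within one process.

```cpp
// Minimal sketch of DSM via computation migration: data has a fixed home
// core, and accesses run at the home instead of replicating the data.
// Hypothetical names; illustrative only.
#include <array>
#include <cstddef>
#include <iostream>
#include <vector>

constexpr std::size_t kCores = 4;

struct Core {
    std::vector<int> local_memory = std::vector<int>(256, 0);
};

std::array<Core, kCores> cores;

// Each address has a single home core; the data never migrates.
std::size_t home_of(std::size_t addr) { return addr % kCores; }

// "Migrate" the computation: run the access on the home core's data.
void remote_write(std::size_t addr, int value) {
    cores[home_of(addr)].local_memory[addr / kCores] = value;
}

int remote_read(std::size_t addr) {
    return cores[home_of(addr)].local_memory[addr / kCores];
}

int main() {
    remote_write(42, 7);                   // executed at core 42 % 4 == 2
    std::cout << remote_read(42) << "\n";  // 7, no coherence protocol needed
}
```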