9 research outputs found

    Reducing Host Load, Network Load and Latency in a Distributed Shared Memory

    Mether is a Distributed Shared Memory (DSM) that runs on Sun¹ workstations under the SunOS 4.0 operating system. User programs access the Mether address space in a way indistinguishable from other memory. Mether had a number of performance problems which we had also seen on a distributed shared memory called MemNet[2]. In this paper we discuss changes we made to Mether and protocols we developed to use Mether that minimize host load, network load, and latency. An interesting (and unexpected) result was that, for one problem we studied, the best protocol for Mether is identical to the best protocol for MemNet[6]. The changes to Mether involve exposing an inconsistent store to the application and making access to the consistent and inconsistent versions very convenient; providing both demand-driven and data-driven semantics for updating pages; and allowing the user to specify that only a small subset of a page need be transferred. All of these operations are encoded in a few address bits in the Mether virtual address.
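As a rough illustration of how a few virtual-address bits could select among the behaviours listed in this abstract, the sketch below packs and unpacks three flag bits. The bit positions, the flag names, the 8 KiB page size and the `encode`/`decode` helpers are all hypothetical assumptions; the abstract does not give Mether's actual address layout.

```python
from dataclasses import dataclass

# Hypothetical layout (an assumption, not Mether's real encoding):
# bit 31 = inconsistent copy, bit 30 = data-driven update,
# bit 29 = short (partial-page) transfer, bits 0..28 = page number + offset.
PAGE_SHIFT = 13                      # assumed 8 KiB pages
FLAG_INCONSISTENT = 1 << 31
FLAG_DATA_DRIVEN = 1 << 30
FLAG_SHORT = 1 << 29
ADDR_MASK = (1 << 29) - 1


@dataclass(frozen=True)
class Access:
    page: int
    offset: int
    consistent: bool
    data_driven: bool
    short_transfer: bool


def encode(page, offset, *, inconsistent=False, data_driven=False, short=False):
    """Build an address whose high bits carry the access mode."""
    if not 0 <= offset < (1 << PAGE_SHIFT):
        raise ValueError("offset does not fit in a page")
    addr = (page << PAGE_SHIFT) | offset
    if addr > ADDR_MASK:
        raise ValueError("page number too large for this layout")
    if inconsistent:
        addr |= FLAG_INCONSISTENT
    if data_driven:
        addr |= FLAG_DATA_DRIVEN
    if short:
        addr |= FLAG_SHORT
    return addr


def decode(addr):
    """Recover the page, the offset and the access mode from an address."""
    low = addr & ADDR_MASK
    return Access(
        page=low >> PAGE_SHIFT,
        offset=low & ((1 << PAGE_SHIFT) - 1),
        consistent=not (addr & FLAG_INCONSISTENT),
        data_driven=bool(addr & FLAG_DATA_DRIVEN),
        short_transfer=bool(addr & FLAG_SHORT),
    )
```

Under this assumed layout, the same page is reachable through several address aliases, and the alias a program dereferences selects the consistency and transfer behaviour without any extra call.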

    Traffic Characteristics of a Distributed Memory System

    We believe that many distributed computing systems of the future will use distributed shared memory as a technique for interprocess communication. Thus, traffic generated by memory requests will be a major component of the traffic for any networks which connect nodes in such a system. In this paper, we study memory reference strings gathered with a tracing program we devised. We study several models. First, we look at raw reference data, as would be seen if the network were a backplane. Second, we examine references in units of blocks, first using a one-block cache model and then with an infinite cache. Finally, we study the effect of predictive prepaging of these blocks on the traffic. We provide a novel representation of memory reference data which can be used to calculate interarrival distributions directly. Integrating communication with computation can be used to control both traffic and performance.
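The three traffic models named in this abstract can be sketched as simple counts over a reference string. This is a minimal illustration, assuming a trace is a list of integer addresses and that one network transfer is charged per block fetch; the function names and the block size are assumptions, not the paper's tooling.

```python
def raw_traffic(refs):
    """Backplane model: every reference crosses the network."""
    return len(refs)


def one_block_cache_fetches(refs, block_size):
    """One-block cache: fetch whenever the referenced block changes."""
    fetches = 0
    current = None
    for addr in refs:
        block = addr // block_size
        if block != current:
            fetches += 1
            current = block
    return fetches


def infinite_cache_fetches(refs, block_size):
    """Infinite cache: each distinct block is fetched exactly once."""
    return len({addr // block_size for addr in refs})
```

For the trace `[0, 4, 8, 64, 68, 0, 128]` with 32-byte blocks, the block sequence is 0, 0, 0, 2, 2, 0, 4, so the three models give 7, 4 and 3 transfers respectively.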

    Architecture and Performance of the Mether Network Shared Memory

    Mether is a Network Shared Memory (NSM). It allows applications on autonomous computers connected by a network to share a segment of memory. NSMs offer the attraction of a simple abstraction for shared state, i.e., shared memory. NSMs have a potential performance problem in the cost of remote references, which is typically solved by grouping memory into larger units such as pages, and caching pages. While Mether employs grouping and caching to reduce the average memory reference delay, it also removes the need for many remote references (page faults) by providing a facility with relaxed consistency requirements. Applications ported from a multiprocessor supercomputer with shared memory to a 16-workstation Mether configuration showed a cost/performance advantage of over 300 in favor of the Mether system. While Mether is currently implemented for Sun-3 and Sun-4 systems connected via Ethernet, other characteristics (such as a choice of page sizes and a semaphore-like access mode useful for process synchronization) should suit it to a wide variety of networks. A reimplementation for an alternate configuration employing packet-switched networks is in progress.

    Parallel processing for scientific computations

    This project investigated the requirements for supporting distributed scientific computation on a cluster of cooperating workstations. Various experiments on computations for the solution of simultaneous linear equations were performed in the early phase of the project to gain experience in the general nature and requirements of scientific applications. A specification of a distributed integrated computing environment, DICE, based on a distributed shared memory communication paradigm has been developed and evaluated. The distributed shared memory model facilitates porting existing parallel algorithms that have been designed for shared memory multiprocessor systems to the new environment. The potential of this new environment is to provide supercomputing capability through the utilization of the aggregate power of workstations cooperating in a cluster interconnected via a local area network. Individual workstations generally lack the computing power to tackle complex scientific applications, which makes them useful for such applications primarily in visualization, data reduction, and filtering. There is a tremendous amount of computing power that is left unused in a network of workstations. Very often a workstation is simply sitting idle on a desk. A set of tools can be developed to take advantage of this potential computing power to create a platform suitable for large scientific computations. The integration of several workstations into a logical cluster of distributed, cooperative computing stations presents an alternative to shared memory multiprocessor systems. In this project we designed and evaluated such a system.

    Orca: A Language for Parallel Programming of Distributed Systems

    Orca is a language for implementing parallel applications on loosely coupled distributed systems. Unlike most languages for distributed programming, it allows processes on different machines to share data. Such data are encapsulated in data-objects, which are instances of user-defined abstract data types. The implementation of Orca takes care of the physical distribution of objects among the local memories of the processors. In particular, an implementation may replicate and/or migrate objects in order to decrease access times to objects and increase parallelism. This paper gives a detailed description of the Orca language design and motivates the design choices. Orca is intended for applications programmers rather than systems programmers. This is reflected in its design goals to provide a simple, easy to use language that is type-secure and provides clean semantics. The paper discusses three example parallel applications in Orca, one of which is described in detail. It also describes..

    Innovation in the Wireless Ecosystem: A Customer-Centric Framework

    The Federal Communications Commission’s Notice of Inquiry in GN 09-157 Fostering Innovation and Investment in the Wireless Communications Market is a significant event at an opportune moment. Wireless communications has already radically changed the way not only Americans but people the world over communicate with each other and access and share information, and there appears no end in sight to this fundamental shift in communication markets. Although the wireless communications phenomenon is global, the US has played and will continue to play a major role in the shaping of this market. At the start of a new US Administration and important changes in the FCC, it is most appropriate that this proceeding be launched.

    Mermera: Non-Coherent Distributed Shared Memory for Parallel Computing

    The proliferation of inexpensive workstations and networks has prompted several researchers to use such distributed systems for parallel computing. Attempts have been made to offer a shared-memory programming model on such distributed memory computers. Most systems provide a shared memory that is coherent in that all processes that use it agree on the order of all memory events. This dissertation explores the possibility of a significant improvement in the performance of some applications when they use non-coherent memory. First, a new formal model to describe existing non-coherent memories is developed. I use this model to prove that certain problems can be solved using asynchronous iterative algorithms on shared memory in which the coherence constraints are substantially relaxed. In the course of the development of the model I discovered a new type of non-coherent behavior called Local Consistency. Second, a programming model, Mermera, is proposed. It provides programmers with a choice of hierarchically related non-coherent behaviors along with one coherent behavior. Thus, one can trade off the ease of programming with coherent memory for improved performance with non-coherent memory. As an example, I present a program to solve a linear system of equations using an asynchronous iterative algorithm. This program uses all the behaviors offered by Mermera. Third, I describe the implementation of Mermera on a BBN Butterfly TC2000 and on a network of workstations. The performance of a version of the equation solving program that uses all the behaviors of Mermera is compared with that of a version that uses coherent behavior only. For a system of 1000 equations the former exhibits at least a 5-fold improvement in convergence time over the latter. The version using coherent behavior only does not benefit from employing more than one workstation to solve the problem, while the program using non-coherent behavior continues to achieve improved performance as the number of workstations is increased from 1 to 6. This measurement corroborates our belief that non-coherent shared memory can be a performance boon for some applications.
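The idea of solving a linear system with an asynchronous iterative algorithm on relaxed-coherence memory can be sketched in a single-threaded simulation. This is not Mermera's API or the dissertation's program: the function name `delayed_jacobi`, the fixed per-worker staleness schedule and the example system are assumptions made for illustration. Each simulated worker updates one unknown with a Jacobi step while reading a copy of the shared vector that may be up to `max_delay` steps old.

```python
def delayed_jacobi(A, b, steps=200, max_delay=2):
    """Jacobi iteration in which each worker may read stale shared values.

    Worker i owns unknown i. Its reads come from the shared vector as it
    stood `lag` steps ago, imitating memory whose updates arrive late.
    Intended for strictly diagonally dominant A, where bounded staleness
    does not prevent convergence.
    """
    n = len(b)
    history = [[0.0] * n]                 # history[t] = shared vector after t steps
    for _ in range(steps):
        nxt = history[-1][:]
        for i in range(n):
            lag = i % (max_delay + 1)     # hypothetical staleness schedule
            seen = history[max(0, len(history) - 1 - lag)]
            s = sum(A[i][j] * seen[j] for j in range(n) if j != i)
            nxt[i] = (b[i] - s) / A[i][i]
        history.append(nxt)
    return history[-1]


# Example system with exact solution x = [1, 2, 3] (strictly diagonally dominant).
A = [[4.0, 1.0, 0.0],
     [1.0, 5.0, 2.0],
     [0.0, 2.0, 6.0]]
b = [6.0, 17.0, 22.0]
x = delayed_jacobi(A, b)
```

Setting `max_delay=0` recovers ordinary synchronous Jacobi, which makes it easy to compare the two behaviours on the same system.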

    Reducing Host Load, Network Load, and Latency in a Distributed Shared Memory

    Mether is a Distributed Shared Memory (DSM) that runs on Sun¹ workstations under the SunOS 4.0 operating system. User programs access the Mether address space in a way indistinguishable from other memory. Mether had a number of performance problems which we had also seen on a distributed shared memory called MemNet[2]. In this paper we discuss changes we made to Mether and protocols we developed to use Mether that minimize host load, network load, and latency. An interesting (and unexpected) result was that, for one problem we studied, the "best" protocol for Mether is identical to the "best" protocol for MemNet[6]. The changes to Mether involve exposing an inconsistent store to the application and making access to the consistent and inconsistent versions very convenient; providing both demand-driven and data-driven semantics for updating pages; and allowing the user to specify that only a small subset of a page need be transferred. All of these operations are encoded in a few ad..