18 research outputs found

    Parallel Programming Using Shared Objects and Broadcasting

    Get PDF
    The two major design approaches taken to build distributed and parallel computer systems, multiprocessing and multicomputing, are discussed. A model that combines the best properties of both multiprocessor and multicomputer systems, easy-to-build hardware, and a conceptually simple programming model is presented. Using this model, a programmer defines and invokes operations on shared objects, the runtime system handles reads and writes on these objects, and the reliable broadcast layer implements indivisible updates to objects using the sequencing protocol. The resulting system is easy to program, easy to build, and has acceptable performance on problems with a moderate grain size in which reads are much more common than writes. Orca, a procedural language whose sequential constructs are roughly similar to languages like C or Modula 2 but which also supports parallel processes and shared objects and has been used to develop applications for the prototype system, is described

    PCODE: an efficient and reliable collective communication protocol for unreliable broadcast domain

    Get PDF
    Existing programming environments for clusters are typically built on top of a point-to-point communication layer (send and receive) over local area networks (LANs) and, as a result, suffer from poor performance in the collective communication part. For example, a broadcast that is implemented using a TCP/IP protocol (which is a point-to-point protocol) over a LAN is obviously inefficient as it is not utilizing the fact that the LAN is a broadcast medium. We have observed that the main difference between a distributed computing paradigm and a message passing parallel computing paradigm is that, in a distributed environment the activity of every processor is independent while in a parallel environment the collection of the user-communication layers in the processors can be modeled as a single global program. We have formalized the requirements by defining the notion of a correct global program. This notion provides a precise specification of the interface between the transport layer and the user-communication layer. We have developed PCODE, a new communication protocol that is driven by a global program and proved its correctness. We have implemented the PCODE protocol on a collection of IBM RS/6000 workstations and on a collection of Silicon Graphics Indigo workstations, both communicating via UDP broadcast. The experimental results we obtained indicate that the performance advantage of PCODE over the current point-to-point approach (TCP) can be as high as an order of magnitude on a cluster of 16 workstations

    MPI Collective Operations over IP Multicast

    Get PDF
    Many common implementations of Message Passing Inter- face (MPI) implement collective operations over point-to-point operations. This work examines IP multicast as a framework for collective operations. IP multicast is not reliable. If a receiver is not ready when a message is sent via IP multicast, the message is lost. Two techniques for ensuring that a message is not lost due to a slow receiving process are examined. The techniques are implemented and compared experimentally over both a shared and a switched Fast Ethernet. The average performance of collective operations is improved as a function of the number of participating processes and message size for both networks

    Transparent Fault-tolerance in Parallel Orca Programs

    Get PDF
    With the advent of large-scale parallel computing systems, making parallel programs fault-tolerant becomes an important problem, because the probability of a failure increases with the number of processors. In this paper, we describe a very simple scheme for rendering a class of parallel Orca programs fault-tolerant. Also, we discuss our experience with implementing this scheme on Amoeba. Our approach works for parallel applications that are not interactive. The approach is based on making a globally consistent checkpoint from time to time and rolling back to the last checkpoint when a processor fails. Making a consistent global checkpoint is easy in Orca, because its implementation is based on reliable broadcast. The advantages of our approach are its simplicity, ease of implementation, low overhead, and transparency to the Orca programmer. 1

    Group Communication in Amoeba and its Applications

    Get PDF
    Unlike many other operating systems, Amoeba is a distributed operating system that provides group communication (i.e., one-to-many communication). We wil

    Replication Techniques for Speeding up Parallel Applications on Distributed Systems

    Get PDF
    This paper discusses the design choices involved in replicating objects and their effect on performance. Important issues are: how to maintain consistency among different copies of an object; how to implement changes to objects; and which strategy for object replication to use. We have implemented several options to determine which ones are most efficient

    A Comparison of Two Paradigms for Distributed Shared Memory

    Get PDF
    This paper compares two paradigms for Distributed Shared Memory on loosely coupled computing systems: the shared data-object model as used in Orca, a programming language specially designed for loosely coupled computing systems and the Shared Virtual Memory model. For both paradigms two systems are described, one using only point-to-point messages, the other using broadcasting as well. The two paradigms and their implementations are described briefly. Their performances on four applications are compared: the travelling-salesman problem, alpha-beta search, matrix multiplication and the all-pairs shortest paths problem. The relevant measurements were obtained on a system consisting of 10 MC68020 processors connected by an Ethernet. For comparison purposes, the applications have also been run on a system with physical shared memory. In addition, the paper gives measurements for the first two applications above when Remote Procedure Call is used as the communication mechanism. The measurements show that both paradigms can be used efficiently for programming large-grain parallel applications, with significant speed-ups. The structured shared data-object model achieves the highest speed-ups and is easiest to program and to debug. KEYWORDS: Amoeba Distributed shared memory Distributed programming Orc

    Introduction to the Literature on Programming Language Design

    Get PDF
    This is an introduction to the literature on programming language design and related topics. It is intended to cite the most important work, and to provide a place for students to start a literature search

    Introduction to the Literature On Programming Language Design

    Get PDF
    This is an introduction to the literature on programming language design and related topics. It is intended to cite the most important work, and to provide a place for students to start a literature search
    corecore