Distributed Programming with Shared Data
Until recently, at least one thing was clear about parallel programming: tightly coupled (shared memory) machines were programmed in a language based on shared variables and loosely coupled (distributed) systems were programmed using message passing. The explosive growth of research on distributed systems and their languages, however, has led to several new methodologies that blur this simple distinction. Operating system primitives (e.g., problem-oriented shared memory, Shared Virtual Memory, the Agora shared memory) and languages (e.g., Concurrent Prolog, Linda, Emerald) for programming distributed systems have been proposed that support the shared variable paradigm without the presence of physical shared memory. In this paper we will look at the reasons for this evolution, the resemblances and differences among these new proposals, and the key issues in their design and implementation. It turns out that many implementations are based on replication of data. We take this idea one step further, and discuss how automatic replication (initiated by the run-time system) can be used as a basis for a new model, called the shared data-object model, whose semantics are similar to the shared variable model. Finally, we discuss the design of a new language for distributed programming, Orca, based on the shared data-object model.
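The replication idea behind the shared data-object model can be illustrated with a minimal sketch (not Orca itself; all names are invented): the runtime keeps a copy of each shared object on every node, applies write operations to all replicas, and serves reads from the local copy without communication.

```python
# Illustrative sketch of a replication-based shared data object.
# Assumption: the runtime can apply writes to all replicas in a total
# order (real systems need ordered broadcast for this to be consistent).
class SharedDataObject:
    def __init__(self):
        self._replicas = []          # runtime-managed copies, one per node

    def attach(self):
        """Create and register a node's local replica (here, a dict)."""
        replica = {}
        self._replicas.append(replica)
        return replica

    def write(self, key, value):
        """A write operation is applied to every replica."""
        for replica in self._replicas:
            replica[key] = value

    @staticmethod
    def read(replica, key):
        """Reads are served from the local replica, with no communication."""
        return replica[key]

obj = SharedDataObject()
node_a = obj.attach()
node_b = obj.attach()
obj.write("x", 42)
assert SharedDataObject.read(node_a, "x") == 42
assert SharedDataObject.read(node_b, "x") == 42
```

The trade-off this sketch captures is the one the abstract alludes to: replication makes reads cheap and local at the cost of making every write a multi-replica operation.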
Memory sharing for interactive ray tracing on clusters
We present recent results in the application of distributed shared memory to image parallel ray tracing on clusters. Image parallel rendering is traditionally limited to scenes that are small enough to be replicated in the memory of each node, because any processor may require access to any piece of the scene. We solve this problem by making all of a cluster's memory available through software distributed shared memory layers. With gigabit ethernet connections, this mechanism is sufficiently fast for interactive rendering of multi-gigabyte datasets. Object- and page-based distributed shared memories are compared, and optimizations for efficient memory use are discussed.
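The page-based variant mentioned above can be sketched as follows (a hypothetical illustration, not the paper's implementation): scene data that does not fit on one node is fetched on demand in fixed-size pages and cached locally, so that spatial locality amortizes the cost of remote fetches.

```python
# Hypothetical page-based DSM sketch: a local list stands in for the
# remote nodes that hold the full scene. PAGE_SIZE is an illustrative
# parameter, not a value from the paper.
PAGE_SIZE = 4

class PagedDSM:
    def __init__(self, backing):
        self.backing = backing       # stands in for remote scene storage
        self.cache = {}              # page number -> locally cached copy
        self.fetches = 0             # count of simulated network fetches

    def read(self, addr):
        page = addr // PAGE_SIZE
        if page not in self.cache:   # "page fault": fetch the whole page
            start = page * PAGE_SIZE
            self.cache[page] = self.backing[start:start + PAGE_SIZE]
            self.fetches += 1
        return self.cache[page][addr % PAGE_SIZE]

scene = list(range(16))
dsm = PagedDSM(scene)
values = [dsm.read(i) for i in range(8)]   # touches pages 0 and 1 only
assert values == list(range(8))
assert dsm.fetches == 2                    # 8 reads, but only 2 fetches
```

An object-based layer would differ mainly in the fetch granularity: whole scene objects (e.g., triangles or BVH nodes) instead of fixed-size pages.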
A Comparison of Two Paradigms for Distributed Shared Memory
This paper compares two paradigms for Distributed Shared Memory on loosely coupled computing systems: the shared data-object model as used in Orca, a programming language specially designed for loosely coupled computing systems, and the Shared Virtual Memory model. For both paradigms two systems are described, one using only point-to-point messages, the other using broadcasting as well. The two paradigms and their implementations are described briefly. Their performances on four applications are compared: the travelling-salesman problem, alpha-beta search, matrix multiplication and the all-pairs shortest paths problem. The relevant measurements were obtained on a system consisting of 10 MC68020 processors connected by an Ethernet. For comparison purposes, the applications have also been run on a system with physical shared memory. In addition, the paper gives measurements for the first two applications above when Remote Procedure Call is used as the communication mechanism. The measurements show that both paradigms can be used efficiently for programming large-grain parallel applications, with significant speed-ups. The structured shared data-object model achieves the highest speed-ups and is easiest to program and to debug. KEYWORDS: Amoeba; Distributed shared memory; Distributed programming; Orca
Strong Equivalence Relations for Iterated Models
The Iterated Immediate Snapshot model (IIS), due to its elegant geometrical representation, has become standard for applying topological reasoning to distributed computing. Its modular structure makes it easier to analyze than the more realistic (non-iterated) read-write Atomic-Snapshot memory model (AS). It is known that AS and IIS are equivalent with respect to \emph{wait-free task} computability: a distributed task is solvable in AS if and only if it is solvable in IIS. We observe, however, that this equivalence is not sufficient for exploring solvability of tasks in \emph{sub-models} of AS (i.e., proper subsets of its runs) or computability of \emph{long-lived} objects, and a stronger equivalence relation is needed. In this paper, we consider \emph{adversarial} sub-models of AS and IIS specified by the sets of processes that can be \emph{correct} in a model run. We show that AS and IIS are equivalent in a strong way: a (possibly long-lived) object is implementable in AS under a given adversary if and only if it is implementable in IIS under the same adversary. Therefore, the computability of any object in shared memory under an adversarial AS scheduler can be equivalently investigated in IIS.
Convex Hull Formation for Programmable Matter
We envision programmable matter as a system of nano-scale agents (called particles) with very limited computational capabilities that move and compute collectively to achieve a desired goal. We use the geometric amoebot model as our computational framework, which assumes particles move on the triangular lattice. Motivated by the problem of sealing an object using minimal resources, we show how a particle system can self-organize to form an object's convex hull. We give a distributed, local algorithm for convex hull formation and prove that it runs in O(B) asynchronous rounds, where B is the length of the object's boundary. Within the same asymptotic runtime, this algorithm can be extended to also form the object's (weak) O-hull, which uses the same number of particles but minimizes the area enclosed by the hull. Our algorithms are the first to compute convex hulls with distributed entities that have strictly local sensing, constant-size memory, and no shared sense of orientation or coordinates. Ours is also the first distributed approach to computing restricted-orientation convex hulls. This approach involves coordinating particles as distributed memory; thus, as a supporting but independent result, we present and analyze an algorithm for organizing particles with constant-size memory as distributed binary counters that efficiently support increments, decrements, and zero-tests, even as the particles move.
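The distributed binary counter mentioned in the abstract can be sketched sequentially (an illustration of the idea, not the paper's amoebot algorithm): each particle holds a single bit, knows only its successor, and an increment propagates a carry from particle to particle, recruiting a new particle when the counter must grow.

```python
# Illustrative sketch of a binary counter spread over constant-memory
# agents. Each Particle stores one bit (LSB at the head of the chain).
class Particle:
    def __init__(self):
        self.bit = 0
        self.next = None        # neighbor holding the next-higher bit

def increment(head):
    p = head
    while True:
        if p.bit == 0:
            p.bit = 1           # carry absorbed; increment done
            return
        p.bit = 0               # 1 + 1 = 0, carry 1; pass it along
        if p.next is None:      # grow the counter by one particle
            p.next = Particle()
        p = p.next

def is_zero(head):
    """Zero-test: scan until a set bit is found or the chain ends."""
    p = head
    while p is not None:
        if p.bit:
            return False
        p = p.next
    return True

head = Particle()
for _ in range(5):
    increment(head)
# 5 is binary 101, stored least-significant bit first: 1, 0, 1
assert (head.bit, head.next.bit, head.next.next.bit) == (1, 0, 1)
assert not is_zero(head)
```

Each step touches only one particle and its neighbor, matching the constant-memory, local-sensing constraints; the paper's additional contribution, not modeled here, is keeping this working while the particles themselves move.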
Munin: Distributed Shared Memory Based on Type-Specific Memory Coherence
We are developing Munin, a system that allows programs written for shared memory multiprocessors to be executed efficiently on distributed memory machines. Thus, Munin overcomes the architectural limitations of shared memory machines, while maintaining their advantages in terms of ease of programming. A unique characteristic of Munin is the mechanism by which the shared memory programming model is translated to the distributed memory hardware. This translation is performed by runtime software, with the aid of semantic hints provided by the user. Each shared data object is supported by a memory coherence mechanism appropriate to the manner in which the object is accessed. This paper focuses on Munin's memory coherence mechanisms, and compares our approach to previous work in this area. This research was supported in part by the National Science Foundation under Grants CCR8716914 and DCA8619893 and by a National Science Foundation Fellowship. In Norse mythology, the ravens Munin (Memory) and Hugin (Thought) perched on Odin's shoulder, and each evening they flew across the world to bring Odin knowledge of man's memories and thoughts. Thus, the raven Munin can be considered to have been the first distributed shared memory mechanism.
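The core of the type-specific approach can be shown with a hedged sketch (the annotation names and protocol descriptions here are illustrative, not Munin's actual interface): the user labels each shared object with its expected access pattern, and the runtime maps that hint to a matching coherence mechanism instead of using one protocol for everything.

```python
# Illustrative mapping from semantic hints to coherence mechanisms.
# The annotation vocabulary below is an assumption for this sketch.
PROTOCOLS = {
    "read_only":    "replicate everywhere; no invalidation ever needed",
    "write_once":   "replicate freely after the single initializing write",
    "migratory":    "keep one copy and move it to the current writer",
    "conventional": "write-invalidate (the safe default)",
}

def choose_protocol(annotation):
    """Pick a coherence mechanism from a user-supplied access-pattern
    hint, falling back to the conservative default for unknown hints."""
    return PROTOCOLS.get(annotation, PROTOCOLS["conventional"])

assert choose_protocol("migratory") == "keep one copy and move it to the current writer"
assert choose_protocol("no-such-hint") == PROTOCOLS["conventional"]
```

The design point this captures is that a wrong hint may cost performance but, with a conservative fallback, need not cost correctness.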
Supporting persistent C++ objects in a distributed storage system
We have designed and implemented a C++ object layer for Khazana, a distributed persistent storage system that exports a flat shared address space as its basic abstraction. The C++ layer described herein lets programmers use familiar C++ idioms to allocate, manipulate, and deallocate persistent shared data structures. It handles the tedious details involved in accessing this shared data, replicating it, maintaining consistency, converting data representations between persistent and in-memory representations, associating type information including methods with objects, etc. To support the C++ object layer on top of Khazana's flat storage abstraction, we have developed a language-specific preprocessor that generates support code to manage the user-specified persistent C++ structures. We describe the design of the C++ object layer and the compiler and runtime mechanisms needed to support it.
CoR's Faster Route over Myrinet
In this paper we concentrate on the efforts made to exploit the performance of Myrinet to build a faster communication route into CoR. By accessing the Myrinet interface through GM, we achieved low latency and high bandwidth message passing without the overhead of a higher level protocol stack, system calls or interrupts. CoR is an ongoing project unique in its design goal of combining multithreading, message passing and distributed shared memory with facilities to dynamically select, from different transport media and protocols, the one that best fits communication and interaction requirements. The ability to mix CoR and PVM calls in the same program brings numerous benefits to the application developer familiar with PVM, notably: 1) new transport communication layers: PvmRouteMyrinet and PvmRouteUdp; 2) migration mechanisms for exploiting fine grain message passing; 3) a thread-safe PVM communication API; 4) object-oriented distributed shared memory.