287 research outputs found

    Necessary and sufficient conditions for 1-adaptivity

    Full text link

    Global synchronization algorithms for the Intel iPSC/860

    Get PDF
    In a distributed memory multicomputer that has no global clock, global processor synchronization can only be achieved through software. Global synchronization algorithms are used in tridiagonal systems solvers, CFD codes, sequence comparison algorithms, and sorting algorithms. They are also useful for event simulation, debugging, and for solving mutual exclusion problems. For the Intel iPSC/860 in particular, global synchronization can be used to ensure the most effective use of the communication network for operations such as the shift, where each processor in a one-dimensional array or ring concurrently sends a message to its right (or left) neighbor. Three global synchronization algorithms are considered for the iPSC/860: the gysnc() primitive provided by Intel, the PICL primitive sync0(), and a new recursive doubling synchronization (RDS) algorithm. The performance of these algorithms is compared to the performance predicted by communication models of both the long and forced message protocols. Measurements of the cost of shift operations preceded by global synchronization show that the RDS algorithm always synchronizes the nodes more precisely and costs only slightly more than the other two algorithms

    A message passing kernel for the hypercluster parallel processing test bed

    Get PDF
    A Message-Passing Kernel (MPK) for the Hypercluster parallel-processing test bed is described. The Hypercluster is being developed at the NASA Lewis Research Center to support investigations of parallel algorithms and architectures for computational fluid and structural mechanics applications. The Hypercluster resembles the hypercube architecture except that each node consists of multiple processors communicating through shared memory. The MPK efficiently routes information through the Hypercluster, using a message-passing protocol when necessary and faster shared-memory communication whenever possible. The MPK also interfaces all of the processors with the Hypercluster operating system (HYCLOPS), which runs on a Front-End Processor (FEP). This approach distributes many of the I/O tasks to the Hypercluster processors and eliminates the need for a separate I/O support program on the FEP

    Almost optimal asynchronous rendezvous in infinite multidimensional grids

    Get PDF
    Two anonymous mobile agents (robots) moving in an asynchronous manner have to meet in an infinite grid of dimension δ> 0, starting from two arbitrary positions at distance at most d. Since the problem is clearly infeasible in such general setting, we assume that the grid is embedded in a δ-dimensional Euclidean space and that each agent knows the Cartesian coordinates of its own initial position (but not the one of the other agent). We design an algorithm permitting the agents to meet after traversing a trajectory of length O(d δ polylog d). This bound for the case of 2d-grids subsumes the main result of [12]. The algorithm is almost optimal, since the Ω(d δ) lower bound is straightforward. Further, we apply our rendezvous method to the following network design problem. The ports of the δ-dimensional grid have to be set such that two anonymous agents starting at distance at most d from each other will always meet, moving in an asynchronous manner, after traversing a O(d δ polylog d) length trajectory. We can also apply our method to a version of the geometric rendezvous problem. Two anonymous agents move asynchronously in the δ-dimensional Euclidean space. The agents have the radii of visibility of r1 and r2, respectively. Each agent knows only its own initial position and its own radius of visibility. The agents meet when one agent is visible to the other one. We propose an algorithm designing the trajectory of each agent, so that they always meet after traveling a total distance of O( ( d)), where r = min(r1, r2) and for r ≥ 1. r)δpolylog ( d r

    Computing in the RAIN: a reliable array of independent nodes

    Get PDF
    The RAIN project is a research collaboration between Caltech and NASA-JPL on distributed computing and data-storage systems for future spaceborne missions. The goal of the project is to identify and develop key building blocks for reliable distributed systems built with inexpensive off-the-shelf components. The RAIN platform consists of a heterogeneous cluster of computing and/or storage nodes connected via multiple interfaces to networks configured in fault-tolerant topologies. The RAIN software components run in conjunction with operating system services and standard network protocols. Through software-implemented fault tolerance, the system tolerates multiple node, link, and switch failures, with no single point of failure. The RAIN-technology has been transferred to Rainfinity, a start-up company focusing on creating clustered solutions for improving the performance and availability of Internet data centers. In this paper, we describe the following contributions: 1) fault-tolerant interconnect topologies and communication protocols providing consistent error reporting of link failures, 2) fault management techniques based on group membership, and 3) data storage schemes based on computationally efficient error-control codes. We present several proof-of-concept applications: a highly-available video server, a highly-available Web server, and a distributed checkpointing system. Also, we describe a commercial product, Rainwall, built with the RAIN technology

    A Distributed algorithm to find Hamiltonian cycles in Gnp random graphs

    Get PDF
    In this paper, we present a distributed algorithm to find Hamiltonian cycles in random binomial graphs Gnp. The algorithm works on a synchronous distributed setting by first creating a small cycle, then covering almost all vertices in the graph with several disjoint paths, and finally patching these paths and the uncovered vertices to the cycle. Our analysis shows that, with high probability, our algorithm is able to find a Hamiltonian cycle in Gnp when p_n=omega(sqrt{log n}/n^{1/4}). Moreover, we conduct an average case complexity analysis that shows that our algorithm terminates in expected sub-linear time, namely in O(n^{3/4+epsilon}) pulses.Postprint (published version

    Orca: A Language for Parallel Programming of Distributed Systems

    Get PDF
    Orca is a language for implementing parallel applications on loosely coupled distributed systems. Unlike most languages for distributed programming, it allows processes on different machines to share data. Such data are encapsulated in data-objects, which are instances of user-defined abstract data types. The implementation of Orca takes care of the physical distribution of objects among the local memories of the processors. In particular, an implementation may replicate and/or migrate objects in order to decrease access times to objects and increase parallelism. This paper gives a detailed description of the Orca language design and motivates the design choices. Orca is intended for applications programmers rather than systems programmers. This is reflected in its design goals to provide a simple, easy to use language that is type-secure and provides clean semantics. The paper discusses three example parallel applications in Orca, one of which is described in detail. It also describes..

    Parallel discrete event simulation: A shared memory approach

    Get PDF
    With traditional event list techniques, evaluating a detailed discrete event simulation model can often require hours or even days of computation time. Parallel simulation mimics the interacting servers and queues of a real system by assigning each simulated entity to a processor. By eliminating the event list and maintaining only sufficient synchronization to insure causality, parallel simulation can potentially provide speedups that are linear in the number of processors. A set of shared memory experiments is presented using the Chandy-Misra distributed simulation algorithm to simulate networks of queues. Parameters include queueing network topology and routing probabilities, number of processors, and assignment of network nodes to processors. These experiments show that Chandy-Misra distributed simulation is a questionable alternative to sequential simulation of most queueing network models
    corecore