Search CORE

10,260 research outputs found

Parallel discrete event simulation: A shared memory approach

Author: Malony Allen D.
Mccredie Bradley D.
Reed Daniel A.
Publication venue
Publication date
Field of study

With traditional event list techniques, evaluating a detailed discrete event simulation model can often require hours or even days of computation time. Parallel simulation mimics the interacting servers and queues of a real system by assigning each simulated entity to a processor. By eliminating the event list and maintaining only sufficient synchronization to insure causality, parallel simulation can potentially provide speedups that are linear in the number of processors. A set of shared memory experiments is presented using the Chandy-Misra distributed simulation algorithm to simulate networks of queues. Parameters include queueing network topology and routing probabilities, number of processors, and assignment of network nodes to processors. These experiments show that Chandy-Misra distributed simulation is a questionable alternative to sequential simulation of most queueing network models

Deterministic Consistency: A Programming Model for Shared Memory Parallelism

Author: Aviram Amittai
Ford Bryan
Publication venue
Publication date: 01/01/2010
Field of study

The difficulty of developing reliable parallel software is generating interest in deterministic environments, where a given program and input can yield only one possible result. Languages or type systems can enforce determinism in new code, and runtime systems can impose synthetic schedules on legacy parallel code. To parallelize existing serial code, however, we would like a programming model that is naturally deterministic without language restrictions or artificial scheduling. We propose "deterministic consistency", a parallel programming model as easy to understand as the "parallel assignment" construct in sequential languages such as Perl and JavaScript, where concurrent threads always read their inputs before writing shared outputs. DC supports common data- and task-parallel synchronization abstractions such as fork/join and barriers, as well as non-hierarchical structures such as producer/consumer pipelines and futures. A preliminary prototype suggests that software-only implementations of DC can run applications written for popular parallel environments such as OpenMP with low (<10%) overhead for some applications.Comment: 7 pages, 3 figure

arXiv.org e-Print Archive

CiteSeerX

Network calculus for parallel processing

Author: Kamarava S.
Kesidis G.
Liebeherr J.
Shan Y.
Urgaonkar B.
Publication venue
Publication date: 31/01/2015
Field of study

In this note, we present preliminary results on the use of "network calculus" for parallel processing systems, specifically MapReduce

arXiv.org e-Print Archive

CiteSeerX

CCL: a portable and tunable collective communication library for scalable parallel computers

Author: Alex Ho
Ching-tien Ho
Jehoshua Bruck
Marc Snir
Pablo Elustondo
Robert Cypher
Senior Member
Senior Member
Shlomo Kipnis
Vasanth Bala
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/1995
Field of study

A collective communication library for parallel computers includes frequently used operations such as broadcast, reduce, scatter, gather, concatenate, synchronize, and shift. Such a library provides users with a convenient programming interface, efficient communication operations, and the advantage of portability. A library of this nature, the Collective Communication Library (CCL), intended for the line of scalable parallel computer products by IBM, has been designed. CCL is part of the parallel application programming interface of the recently announced IBM 9076 Scalable POWERparallel System 1 (SP1). In this paper, we examine several issues related to the functionality, correctness, and performance of a portable collective communication library while focusing on three novel aspects in the design and implementation of CCL: 1) the introduction of process groups, 2) the definition of semantics that ensures correctness, and 3) the design of new and tunable algorithms based on a realistic point-to-point communication model

CiteSeerX

Caltech Authors

Requirements for implementing real-time control functional modules on a hierarchical parallel pipelined system

Author: Lumia Ronald
Michaloski John L.
Wheatley Thomas E.
Publication venue
Publication date
Field of study

Analysis of a robot control system leads to a broad range of processing requirements. One fundamental requirement of a robot control system is the necessity of a microcomputer system in order to provide sufficient processing capability.The use of multiple processors in a parallel architecture is beneficial for a number of reasons, including better cost performance, modular growth, increased reliability through replication, and flexibility for testing alternate control strategies via different partitioning. A survey of the progression from low level control synchronizing primitives to higher level communication tools is presented. The system communication and control mechanisms of existing robot control systems are compared to the hierarchical control model. The impact of this design methodology on the current robot control systems is explored