An MPI-IO interface to HPSS
This paper describes an implementation of the proposed MPI-IO (Message Passing Interface - Input/Output) standard for parallel I/O. Our system uses third-party transfer to move data over an external network between the processors where it is used and the I/O devices where it resides. Data travels directly from source to destination, without being shuffled among processors or funneled through a central node. Our distributed server model lets multiple compute nodes share the burden of coordinating data transfers. The system is built on the High Performance Storage System (HPSS), and a prototype version runs on a Meiko CS-2 parallel computer.
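As a rough illustration of the programming interface targeted here, the following C sketch shows a group of MPI processes writing disjoint blocks of a shared file through standard MPI-IO calls. The file name and block size are illustrative assumptions; in the system described, the underlying transfers to HPSS would be serviced by third-party transfer rather than routed through a central node.

/* Minimal MPI-IO sketch: each rank writes its own contiguous block of a
 * shared file.  File name and block size are illustrative. */
#include <mpi.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);
    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    const int count = 1 << 20;                 /* doubles per rank */
    double *buf = malloc(count * sizeof(double));
    for (int i = 0; i < count; i++) buf[i] = (double)rank;

    MPI_File fh;
    MPI_File_open(MPI_COMM_WORLD, "output.dat",
                  MPI_MODE_CREATE | MPI_MODE_WRONLY, MPI_INFO_NULL, &fh);

    /* Each rank writes at its own offset; no data funnels through one node. */
    MPI_Offset offset = (MPI_Offset)rank * count * sizeof(double);
    MPI_File_write_at(fh, offset, buf, count, MPI_DOUBLE, MPI_STATUS_IGNORE);

    MPI_File_close(&fh);
    free(buf);
    MPI_Finalize();
    return 0;
}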
Communication overhead on the Intel Paragon, IBM SP2 and Meiko CS-2
Interprocessor communication overhead is a crucial measure of the power of parallel computing systems; its impact can severely limit the performance of parallel programs. This report presents measurements of communication overhead on three contemporary commercial multicomputer systems: the Intel Paragon, the IBM SP2 and the Meiko CS-2. In each case the time to communicate between processors is presented as a function of message length. The time for global synchronization and memory access is discussed. The performance of these machines in emulating hypercubes and executing random pairwise exchanges is also investigated. It is shown that interprocessor communication time depends heavily on the specific communication pattern required. These observations contradict the commonly held belief that communication overhead on contemporary machines is independent of the placement of tasks on processors. The information presented in this report permits the evaluation of the efficiency of parallel algorithm implementations against standard baselines.
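Measurements of this kind can be reproduced with a simple ping-pong benchmark. The C sketch below times round trips between two MPI ranks over a range of message lengths; the repetition count and message sizes are illustrative choices, not the report's exact methodology.

/* Ping-pong sketch: one-way latency between ranks 0 and 1 as a function of
 * message length.  Sizes and repetition count are illustrative. */
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);
    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    const int reps = 100;
    char *buf = malloc(1 << 20);

    for (int len = 1; len <= (1 << 20); len *= 2) {
        MPI_Barrier(MPI_COMM_WORLD);
        double t0 = MPI_Wtime();
        for (int r = 0; r < reps; r++) {
            if (rank == 0) {
                MPI_Send(buf, len, MPI_CHAR, 1, 0, MPI_COMM_WORLD);
                MPI_Recv(buf, len, MPI_CHAR, 1, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
            } else if (rank == 1) {
                MPI_Recv(buf, len, MPI_CHAR, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
                MPI_Send(buf, len, MPI_CHAR, 0, 0, MPI_COMM_WORLD);
            }
        }
        double t = (MPI_Wtime() - t0) / (2.0 * reps);   /* one-way time */
        if (rank == 0)
            printf("%8d bytes  %.3f us\n", len, t * 1e6);
    }

    free(buf);
    MPI_Finalize();
    return 0;
}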
Multiphase complete exchange on Paragon, SP2 and CS-2
The overhead of interprocessor communication is a major factor limiting the performance of parallel computer systems. The complete exchange is the most demanding communication pattern in that it requires each processor to send a distinct message to every other processor. This pattern is at the heart of many important parallel applications. On hypercubes, the multiphase complete exchange has been developed and shown to provide optimal performance over varying message sizes. Most commercial multicomputer systems do not have a hypercube interconnect. However, they use special-purpose hardware and dedicated communication processors to achieve very high performance communication and can be made to emulate the hypercube quite well. Multiphase complete exchange has been implemented on three contemporary parallel architectures: the Intel Paragon, IBM SP2 and Meiko CS-2. The essential features of these machines are described and their basic interprocessor communication overheads are discussed. The performance of multiphase complete exchange is evaluated on each machine. It is shown that the theoretical ideas developed for hypercubes are also applicable in practice to these machines, and that multiphase complete exchange can lead to major savings in execution time over traditional solutions.
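For orientation, the C sketch below shows the simplest direct complete exchange on a power-of-two number of MPI ranks, in which step s pairs each rank with partner rank XOR s and one block travels in each direction. The multiphase algorithm studied here instead groups hypercube dimensions into phases, combining blocks to trade the number of messages against their size; the block size used below is an illustrative assumption.

/* Direct (single-phase) complete exchange on a power-of-two rank count. */
#include <mpi.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);
    int rank, p;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &p);       /* assumed to be a power of two */

    const int block = 4096;                   /* doubles per destination */
    double *sendbuf = malloc((size_t)p * block * sizeof(double));
    double *recvbuf = malloc((size_t)p * block * sizeof(double));
    for (int i = 0; i < p * block; i++) sendbuf[i] = (double)rank;

    /* The block destined for this rank needs no communication. */
    for (int i = 0; i < block; i++)
        recvbuf[rank * block + i] = sendbuf[rank * block + i];

    /* Step s pairs each rank with partner rank ^ s. */
    for (int s = 1; s < p; s++) {
        int partner = rank ^ s;
        MPI_Sendrecv(sendbuf + (size_t)partner * block, block, MPI_DOUBLE, partner, 0,
                     recvbuf + (size_t)partner * block, block, MPI_DOUBLE, partner, 0,
                     MPI_COMM_WORLD, MPI_STATUS_IGNORE);
    }

    free(sendbuf);
    free(recvbuf);
    MPI_Finalize();
    return 0;
}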
Parallel computing and quantum simulations/011
Our goal was to investigate the suitability of parallel supercomputer architectures for Quantum Monte Carlo (QMC). Because QMC allows one to study the properties of ions and electrons in a solid, it has important applications to condensed matter physics, chemistry, and materials science. Our specific research plan was to: (1) adapt quantum simulation codes, highly optimized for vector supercomputers, to run on the Intel Hypercube and the Thinking Machines CM-5; (2) identify architectural bottlenecks in communication, floating-point computation, and node memory, and determine scalability with the number of nodes; (3) identify algorithmic changes required to take advantage of current and prospective architectures. We have made significant progress towards these goals. We explored implementations of the p4 parallel programming system and the Message Passing Interface (MPI) libraries to run "world-line" and "determinant" QMC and Molecular Dynamics simulations on both workstation clusters (HP, Sparc, AIX, Linux) and massively parallel supercomputers (Intel iPSC/860, Meiko CS-2, IBM SP-X, Intel Paragon). We addressed issues of the efficiency of parallelization as a function of the distribution of the problem over the nodes and the length scale of the interactions between particles. Both choices influence the frequency of inter-node communication and the size of the messages passed. We found that, using the message-passing paradigm on an appropriate machine (e.g., the Intel iPSC/860), an essentially linear speedup could be obtained.
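As a hedged illustration of the communication regime behind that observation, the C sketch below distributes particles over MPI ranks in a one-dimensional domain decomposition and exchanges only fixed-size boundary data with nearest neighbours each step, so message size and frequency stay constant as nodes are added. All names and sizes are assumptions for illustration, not the project's actual codes.

/* 1-D domain-decomposition sketch with nearest-neighbour halo exchange,
 * assuming short-range interactions.  Sizes are illustrative. */
#include <mpi.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);
    int rank, p;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &p);

    const int local = 10000;                 /* particles owned per rank */
    const int halo  = 64;                    /* boundary particles exchanged */
    double *x = calloc(local + 2 * halo, sizeof(double));

    int left  = (rank - 1 + p) % p;
    int right = (rank + 1) % p;

    for (int step = 0; step < 100; step++) {
        /* Send left edge to the left neighbour, receive right ghost layer. */
        MPI_Sendrecv(x + halo, halo, MPI_DOUBLE, left, 0,
                     x + halo + local, halo, MPI_DOUBLE, right, 0,
                     MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        /* Send right edge to the right neighbour, receive left ghost layer. */
        MPI_Sendrecv(x + local, halo, MPI_DOUBLE, right, 1,
                     x, halo, MPI_DOUBLE, left, 1,
                     MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        /* ... local Monte Carlo / molecular-dynamics update would go here ... */
    }

    free(x);
    MPI_Finalize();
    return 0;
}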