A study of the communication cost of the FFT on torus multicomputers

Author: Díaz de Cerio Ripalda Luis Manuel
González Colás Antonio María
Valero García Miguel
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/1995

The computation of a one-dimensional FFT on a c-dimensional torus multicomputer is analyzed. Three approaches are proposed, which differ in the way they use the interconnection network. The first approach is based on the multidimensional index mapping technique for the FFT computation. The second approach starts from a hypercube algorithm and then embeds the hypercube onto the torus. The third approach reduces the communication cost of the hypercube algorithm by pipelining the communication operations. A novel methodology to pipeline the communication operations on a torus is proposed. Analytical models are presented to compare the different approaches. This comparison shows that the best approach depends on the number of dimensions of the torus and on the communication start-up and transfer times. The analytical models allow us to select the most efficient approach for the available machine. Peer reviewed. Postprint (published version).

UPCommons. Portal del coneixement obert de la UPC

On the performance of routing algorithms in wormhole-switched multicomputer networks

Author: Ould-Khaoua M.
Shahrabi A.
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2005

This paper presents a comparative performance study of adaptive and deterministic routing algorithms in wormhole-switched hypercubes and investigates the performance vicissitudes of these routing schemes under a variety of network operating conditions. Contrary to previously reported results, our results show that adaptive routing does not consistently outperform deterministic routing, even for high-dimensional networks. In fact, it appears that the superiority of adaptive routing is highly dependent on the broadcast traffic rate generated at each node, and it begins to deteriorate as the broadcast rate of generated messages grows.

Enlighten

ResearchOnline@GCU

Hyperswitch communication network

Author: Peterson J.
Pniel M.
Upchurch E.

The Hyperswitch Communication Network (HCN) is a large-scale parallel computer prototype being developed at JPL. Commercial versions of the HCN computer are planned. The HCN computer being designed is a message-passing, multiple-instruction, multiple-data (MIMD) computer, and offers many advantages in price-performance ratio, reliability and availability, and manufacturing over traditional uniprocessors and bus-based multiprocessors. The design of the HCN operating system is a uniquely flexible environment that combines both parallel processing and distributed processing. This programming paradigm can achieve a balance among the following competing factors: performance in processing and communications, user friendliness, and fault tolerance. The prototype is being designed to accommodate a maximum of 64 state-of-the-art microprocessors. The HCN is classified as a distributed supercomputer. The HCN system is described, and the performance/cost analysis and other competing factors within the system design are reviewed.

NASA Technical Reports Server

Performance modelling of wormhole-routed hypercubes with bursty traffic and finite buffers

Author: Assi Salam
Kouvatsos Demetres D.
Ould-Khaoua M.
Publication date: 01/01/2005

An open queueing network model (QNM) is proposed for wormhole-routed hypercubes with finite buffers and deterministic routing subject to a compound Poisson arrival process (CPP) with geometrically distributed batches or, equivalently, a generalised exponential (GE) interarrival time distribution. The GE/G/1/K queue and appropriate GE-type flow formulae are adopted, as cost-effective building blocks, in a queue-by-queue decomposition of the entire network. Consequently, analytic expressions for the channel holding time, buffering delay, contention blocking and mean message latency are determined. The validity of the analytic approximations is demonstrated against results obtained through simulation experiments. Moreover, it is shown that wormhole-routed hypercubes suffer progressive performance degradation with increasing traffic variability (burstiness).

Bradford Scholars

MPF: A portable message passing facility for shared memory multiprocessors

Author: Malony Allen D.
Mcguire Patrick J.
Reed Daniel A.

The design, implementation, and performance evaluation of a message passing facility (MPF) for shared memory multiprocessors are presented. The MPF is based on a message passing model conceptually similar to conversations. Participants (parallel processors) can enter or leave a conversation at any time. The message passing primitives for this model are implemented as a portable library of C function calls. The MPF is currently operational on a Sequent Balance 21000, and several parallel applications were developed and tested. Several simple benchmark programs are presented to establish interprocess communication performance for common communication patterns. Finally, performance figures are presented for two parallel applications: linear systems solution and iterative solution of partial differential equations.

NASA Technical Reports Server

A computational group theoretic symmetry reduction package for the SPIN model checker

Author: A.D. Pierro
B. Aziz
C. Priami
D. Hirsch
D. Moore
D.T. Gillespie
G. Plotkin
I. Stark
J. Hillston
L. Cardelli
M. Bernardo
O.M. Herescu
P. Buchholz
R. Milner
S. Abramsky
S. Gilmore
W.C. Rounds
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2006

Symmetry reduced model checking is hindered by two problems: how to identify state space symmetry when systems are not fully symmetric, and how to determine equivalence of states during search. We present TopSpin, a fully automatic symmetry reduction package for the Spin model checker. TopSpin uses the Gap computational algebra system to effectively detect state space symmetry from the associated Promela specification, and to choose an efficient symmetry reduction strategy by classifying automorphism groups as a disjoint/wreath product of subgroups. We present encouraging experimental results for a variety of Promela examples.

Crossref

Portsmouth University Research Portal (Pure)

Spiral - Imperial College Digital Repository

Enlighten

ePubs: the open archive for STFC research publications

Macquarie University ResearchOnline

Concurrent Image Processing Executive (CIPE). Volume 1: Design overview

Author: Groom Steven L.
Lee Meemong
Mazer Alan S.
Williams Winifred I.

The design and implementation of a Concurrent Image Processing Executive (CIPE), which is intended to become the support system software for a prototype high-performance science analysis workstation, are described. The target machine for this software is a JPL/Caltech Mark 3fp Hypercube hosted by either a MASSCOMP 5600 or a Sun-3 or Sun-4 workstation; however, the design will accommodate other concurrent machines of similar architecture, i.e., local memory, multiple-instruction-multiple-data (MIMD) machines. The CIPE system provides both a multimode user interface and an applications programmer interface, and has been designed around four loosely coupled modules: user interface, host-resident executive, hypercube-resident executive, and application functions. The loose coupling between modules allows modification of a particular module without significantly affecting the other modules in the system. In order to enhance hypercube memory utilization and to allow expansion of image processing capabilities, a specialized program management method, incremental loading, was devised. To minimize data transfer between host and hypercube, a data management method which distributes, redistributes, and tracks data set information was implemented. The data management method also allows data sharing among application programs. The CIPE software architecture provides a flexible environment for scientific analysis of complex remote sensing image data, such as planetary data and imaging spectrometry, utilizing state-of-the-art concurrent computation capabilities.

NASA Technical Reports Server

Programming distributed memory architectures using Kali

Author: Mehrotra Piyush
Vanrosendale John

Programming nonshared memory systems is more difficult than programming shared memory systems, in part because of the relatively low level of current programming environments for such machines. A new programming environment, Kali, is presented, which provides a global name space and allows direct access to remote data values. In order to retain efficiency, Kali provides a system of annotations, allowing the user to control those aspects of the program critical to performance, such as data distribution and load balancing. The primitives and constructs provided by the language are described, and some of the issues raised in translating a Kali program for execution on distributed memory systems are also discussed.

NASA Technical Reports Server