6 research outputs found

    List Ranking on PC Clusters

    We present two algorithms for the List Ranking Problem in the Coarse Grained Multicomputer model (CGM for short): if p is the number of processors and n the size of the list, then we give a deterministic one that achieves O(log p log* p) communication rounds and O(n log* p) required communication cost and total computation time, and a randomized one that requires O(log p) communication rounds and O(n) required communication cost and total computation time. We report on experimental studies of these algorithms on a PC cluster interconnected by a Myrinet network. As far as we know, this is the first portable code for this problem that runs on a cluster. With these experimental studies, we assess the validity of the chosen CGM model, and also show the possible gains and limits of such algorithms on PC clusters.
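    As background for the abstracts above: list ranking asks, for every node of a linked list, its distance to the tail. The classic fine-grained parallel solution is Wyllie's pointer jumping, whose O(log n) rounds the coarse-grained (CGM) algorithms above reduce to a function of p alone. A minimal single-process simulation of pointer jumping (illustrative only; the function name and array representation are mine, not the papers' CGM code):

```python
def pointer_jump_rank(succ):
    """Simulate Wyllie's pointer-jumping list ranking.

    succ[i] is the successor of node i; the tail t satisfies succ[t] == t.
    Each simulated synchronous round doubles the distance every pointer
    spans, so O(log n) rounds suffice.
    Returns rank[i] = number of links from node i to the tail.
    """
    n = len(succ)
    # Tail starts at rank 0; every other node is one link from its successor.
    rank = [0 if succ[i] == i else 1 for i in range(n)]
    nxt = list(succ)
    # Loop until every pointer has reached the tail (the only self-loop).
    while any(nxt[i] != nxt[nxt[i]] for i in range(n)):
        # One synchronous round: all nodes accumulate and jump at once.
        rank = [rank[i] + rank[nxt[i]] for i in range(n)]
        nxt = [nxt[nxt[i]] for i in range(n)]
    return rank
```

This toy version uses O(n log n) total work, which is exactly the inefficiency the CGM algorithms above avoid: they aim for O(n) total work spread over O(log p) (or O(log p log* p)) communication rounds.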

    Feasibility, Portability, Predictability and Efficiency: Four Ambitious Goals for the Design and Implementation of Parallel Coarse Grained Graph Algorithms

    We study the relationship between the design and analysis of graph algorithms in the coarse-grained parallel models and the behavior of the resulting code on today's parallel machines and clusters. We conclude that the coarse grained multicomputer model (CGM) is well suited to the design of competitive algorithms, and that it is therefore now possible to aim at developing portable, predictable and efficient parallel code for graph problems.

    Randomized Parallel List Ranking For Distributed Memory Multiprocessors

    No full text
    We present a randomized parallel list ranking algorithm for distributed memory multiprocessors, using a BSP-like model. We first describe a simple version which requires, with high probability, log(3p) + log ln(n) = Õ(log p + log log n) communication rounds (h-relations with h = Õ(n/p)) and Õ(n/p) local computation. We then outline an improved version which requires, with high probability, only (4k + 6) log((2/3)p) + 8 = Õ(k log p) communication rounds, where k = min{i ≥ 0 : ln^(i+1)(n) ≤ ((2/3)p)^(2^(i+1))}. Note that k ≤ ln*(n) is an extremely small number. For n ≤ 10^(10^100) and p ≥ 4, the value of k is at most 2. Hence, for a given number of processors p, the number of communication rounds required is, for all practical purposes, independent of n. For n ≤ 1,500,000 and 4 ≤ p ≤ 2048, the number of communication rounds in our algorithm is bounded, with high probability, by 78, but the actual number of communication rounds observed so far is 25 in the worst case. …

    Randomized Parallel List Ranking For Distributed Memory Multiprocessors

    No full text
    We present a randomized parallel list ranking algorithm for distributed memory multiprocessors. A simple version requires, with high probability, log(3p) + log ln(n) = Õ(log p + log log n) communication rounds (h-relations with h = Õ(n/p)) and Õ(n/p) local computation. An improved version requires, with high probability, only (4k + 6) log((2/3)p) + 8 = Õ(k log p) communication rounds, where k = min{i ≥ 0 : ln^(i+1)(n) ≤ ((2/3)p)^(2^(i+1))}. Note that k ≤ ln*(n) is an extremely small number. For n ≤ 10^(10^100) and p ≥ 4, the value of k is at most 2. For a given number of processors p, the number of communication rounds required is, for all practical purposes, independent of n. For n ≤ 10^(10^100) and 4 ≤ p ≤ 2048, the number of communication rounds in our algorithm is bounded, with high probability, by 118. We conjecture that the actual number of communication rounds will not exceed 50. …
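    The randomized bounds above come from repeatedly splicing out a random independent set of list nodes until the list is short, then undoing the splices to recover ranks. A toy one-process simulation of that coin-flipping splice step (the function name, coin rule, and data layout are my own illustration, not the authors' distributed implementation):

```python
import random


def randomized_list_rank(succ, head, seed=42):
    """Toy sequential simulation of randomized list ranking.

    succ[i] is node i's successor; the tail t satisfies succ[t] == t.
    Each round, every node flips a coin; a node whose coin is heads while
    its predecessor's coin is tails is spliced out.  No two adjacent nodes
    are removed in the same round, so every splice is safe.  Splices are
    then undone in reverse order to assign ranks.
    Returns rank[i] = number of links from node i to the tail.
    """
    n = len(succ)
    nxt = list(succ)
    dist = [0 if nxt[i] == i else 1 for i in range(n)]
    pred = list(range(n))                      # head keeps pred[head] == head
    for i in range(n):
        if nxt[i] != i:
            pred[nxt[i]] = i
    tail = next(i for i in range(n) if succ[i] == i)
    alive = set(range(n))
    rng = random.Random(seed)
    removed = []                               # (node, successor, dist) at splice time
    while len(alive) > 2:
        coin_heads = {i for i in alive if rng.random() < 0.5}
        for i in list(alive):
            if (i in coin_heads and i != head and i != tail
                    and pred[i] not in coin_heads):
                removed.append((i, nxt[i], dist[i]))
                dist[pred[i]] += dist[i]       # predecessor absorbs i's links
                nxt[pred[i]] = nxt[i]
                pred[nxt[i]] = pred[i]
                alive.discard(i)
    rank = [0] * n                             # only head and tail remain alive
    rank[head] = dist[head]
    for i, nx, d in reversed(removed):         # undo splices, newest first
        rank[i] = d + rank[nx]
    return rank
```

In the distributed setting each round of coin flips and splices is one communication round, and the papers' analysis bounds how many such rounds are needed with high probability; this sketch only shows why the splice-and-undo scheme computes correct ranks.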