6 research outputs found

    List Ranking on PC Clusters

    We present two algorithms for the List Ranking Problem in the Coarse Grained Multicomputer model (CGM for short): if p is the number of processors and n the size of the list, then we give a deterministic one that achieves O(log p log* p) communication rounds and O(n log* p) required communication cost and total computation time, and a randomized one that requires O(log p) communication rounds and O(n) required communication cost and total computation time. We report on experimental studies of these algorithms on a PC cluster interconnected by a Myrinet network. As far as we know, this is the first portable code for this problem that runs on a cluster. With these experimental studies, we assess the validity of the chosen CGM model, and also show the possible gains and limits of such algorithms on PC clusters.
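    As background for the abstracts above: list ranking asks, for every node of a linked list, its distance to the tail. The classic fine-grained parallel solution is Wyllie's pointer jumping, whose O(log n) rounds the coarse-grained (CGM) algorithms above reduce to a function of p alone. A minimal single-process simulation of pointer jumping (illustrative only; the function name and array representation are mine, not the papers' CGM code):

```python
def pointer_jump_rank(succ):
    """Simulate Wyllie's pointer-jumping list ranking.

    succ[i] is the successor of node i; the tail t satisfies succ[t] == t.
    Each simulated synchronous round doubles the distance every pointer
    spans, so O(log n) rounds suffice.
    Returns rank[i] = number of links from node i to the tail.
    """
    n = len(succ)
    # Tail starts at rank 0; every other node is one link from its successor.
    rank = [0 if succ[i] == i else 1 for i in range(n)]
    nxt = list(succ)
    # Loop until every pointer has reached the tail (the only self-loop).
    while any(nxt[i] != nxt[nxt[i]] for i in range(n)):
        # One synchronous round: all nodes accumulate and jump at once.
        rank = [rank[i] + rank[nxt[i]] for i in range(n)]
        nxt = [nxt[nxt[i]] for i in range(n)]
    return rank
```

This toy version uses O(n log n) total work, which is exactly the inefficiency the CGM algorithms above avoid: they aim for O(n) total work spread over O(log p) (or O(log p log* p)) communication rounds.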

    Feasibility, Portability, Predictability and Efficiency: Four Ambitious Goals for the Design and Implementation of Parallel Coarse Grained Graph Algorithms

    We study the relationship between the design and analysis of graph algorithms in the coarse-grained parallel models and the behavior of the resulting code on today's parallel machines and clusters. We conclude that the coarse grained multicomputer model (CGM) is well suited to the design of competitive algorithms, and that it is therefore now possible to aim at developing portable, predictable and efficient parallel code for graph problems.

    Randomized Parallel List Ranking For Distributed Memory Multiprocessors

    No full text
    We present a randomized parallel list ranking algorithm for distributed memory multiprocessors, using a BSP-like model. We first describe a simple version which requires, with high probability, log(3p) + log ln(n) = Õ(log p + log log n) communication rounds (h-relations with h = Õ(n/p)) and Õ(n/p) local computation. We then outline an improved version which requires, with high probability, only (4k + 6) log((2/3)p) + 8 = Õ(k log p) communication rounds, where k = min{i ≥ 0 : ln^(i+1)(n) ≤ ((2/3)p)^(2^(i+1))}. Note that k ≤ ln*(n) is an extremely small number. For n ≤ 10^(10^100) and p ≥ 4, the value of k is at most 2. Hence, for a given number of processors p, the number of communication rounds required is, for all practical purposes, independent of n. For n ≤ 1,500,000 and 4 ≤ p ≤ 2048, the number of communication rounds in our algorithm is bounded, with high probability, by 78, but the actual number of communication rounds observed so far is 25 in the worst case. …

    Randomized Parallel List Ranking For Distributed Memory Multiprocessors

    No full text
    We present a randomized parallel list ranking algorithm for distributed memory multiprocessors. A simple version requires, with high probability, log(3p) + log ln(n) = Õ(log p + log log n) communication rounds (h-relations with h = Õ(n/p)) and Õ(n/p) local computation. An improved version requires, with high probability, only (4k + 6) log((2/3)p) + 8 = Õ(k log p) communication rounds, where k = min{i ≥ 0 : ln^(i+1)(n) ≤ ((2/3)p)^(2^(i+1))}. Note that k ≤ ln*(n) is an extremely small number. For n ≤ 10^(10^100) and p ≥ 4, the value of k is at most 2. For a given number of processors p, the number of communication rounds required is, for all practical purposes, independent of n. For n ≤ 10^(10^100) and 4 ≤ p ≤ 2048, the number of communication rounds in our algorithm is bounded, with high probability, by 118. We conjecture that the actual number of communication rounds will not exceed 50. …
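    The randomized bounds above come from repeatedly splicing out a random independent set of list nodes until the list is short, then undoing the splices to recover ranks. A toy one-process simulation of that coin-flipping splice step (the function name, coin rule, and data layout are my own illustration, not the authors' distributed implementation):

```python
import random


def randomized_list_rank(succ, head, seed=42):
    """Toy sequential simulation of randomized list ranking.

    succ[i] is node i's successor; the tail t satisfies succ[t] == t.
    Each round, every node flips a coin; a node whose coin is heads while
    its predecessor's coin is tails is spliced out.  No two adjacent nodes
    are removed in the same round, so every splice is safe.  Splices are
    then undone in reverse order to assign ranks.
    Returns rank[i] = number of links from node i to the tail.
    """
    n = len(succ)
    nxt = list(succ)
    dist = [0 if nxt[i] == i else 1 for i in range(n)]
    pred = list(range(n))                      # head keeps pred[head] == head
    for i in range(n):
        if nxt[i] != i:
            pred[nxt[i]] = i
    tail = next(i for i in range(n) if succ[i] == i)
    alive = set(range(n))
    rng = random.Random(seed)
    removed = []                               # (node, successor, dist) at splice time
    while len(alive) > 2:
        coin_heads = {i for i in alive if rng.random() < 0.5}
        for i in list(alive):
            if (i in coin_heads and i != head and i != tail
                    and pred[i] not in coin_heads):
                removed.append((i, nxt[i], dist[i]))
                dist[pred[i]] += dist[i]       # predecessor absorbs i's links
                nxt[pred[i]] = nxt[i]
                pred[nxt[i]] = pred[i]
                alive.discard(i)
    rank = [0] * n                             # only head and tail remain alive
    rank[head] = dist[head]
    for i, nx, d in reversed(removed):         # undo splices, newest first
        rank[i] = d + rank[nx]
    return rank
```

In the distributed setting each round of coin flips and splices is one communication round, and the papers' analysis bounds how many such rounds are needed with high probability; this sketch only shows why the splice-and-undo scheme computes correct ranks.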