11 research outputs found

    Randomized Permutations in a Coarse Grained Parallel Environment [extended abstract]

    Get PDF
    We show how to uniformly distribute data at random (not to be confused with permutation routing) in a coarse-grained parallel environment with p processors. In contrast to previously known work, our method is able to fulfill the three criteria of uniformity, work-optimality and balance among the processors simultaneously. To guarantee uniformity we investigate the matrix of communication requests between the processors. We show that its distribution is a generalization of the multivariate hypergeometric distribution, and we give algorithms to compute it efficiently.
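
    A minimal illustration of the object this abstract studies, assuming the simplest realization of uniform random placement: every item held by a source processor is sent to an independently and uniformly chosen destination, and the p x p matrix of resulting communication requests is tallied. This is a sketch for intuition, not the paper's algorithm; the parameters p and n_per_proc are made up.

        // Sketch: tally the matrix of communication requests when every item is
        // sent to a uniformly random destination processor (an assumption for
        // illustration, not the paper's method).
        #include <cstddef>
        #include <iostream>
        #include <random>
        #include <vector>

        int main() {
            const std::size_t p = 4;           // number of processors (illustrative)
            const std::size_t n_per_proc = 8;  // items initially held per processor (illustrative)

            std::mt19937_64 rng(std::random_device{}());
            std::uniform_int_distribution<std::size_t> dest(0, p - 1);

            // requests[i][j] = number of items processor i sends to processor j
            std::vector<std::vector<std::size_t>> requests(p, std::vector<std::size_t>(p, 0));
            for (std::size_t src = 0; src < p; ++src)
                for (std::size_t item = 0; item < n_per_proc; ++item)
                    ++requests[src][dest(rng)];

            for (std::size_t i = 0; i < p; ++i) {
                for (std::size_t j = 0; j < p; ++j)
                    std::cout << requests[i][j] << ' ';
                std::cout << '\n';
            }
            return 0;
        }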

    Optimal (Randomized) Parallel Algorithms in the Binary-Forking Model

    Full text link
    In this paper we develop optimal algorithms in the binary-forking model for a variety of fundamental problems, including sorting, semisorting, list ranking, tree contraction, range minima, and ordered set union, intersection and difference. In the binary-forking model, tasks can only fork into two child tasks, but can do so recursively and asynchronously. The tasks share memory, supporting reads, writes and test-and-sets. Costs are measured in terms of work (total number of instructions) and span (longest dependence chain). The binary-forking model is meant to capture both algorithm performance and algorithm-design considerations on many existing multithreaded languages, which are also asynchronous and rely on binary forks either explicitly or under the covers. In contrast to the widely studied PRAM model, it assumes neither arbitrary-way forks nor synchronous operations, both of which are hard to implement in modern hardware. While optimal PRAM algorithms are known for the problems studied herein, it turns out that arbitrary-way forking and strict synchronization are powerful, if unrealistic, capabilities. Natural simulations of these PRAM algorithms in the binary-forking model (i.e., implementations in existing parallel languages) incur an Ω(log n) overhead in span. This paper explores techniques for designing optimal algorithms when limited to binary forking and assuming asynchrony. All algorithms described in this paper are the first algorithms with optimal work and span in the binary-forking model. Most of the algorithms are simple. Many are randomized.
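
    A minimal sketch of the binary-forking style of execution described above (not one of the paper's optimal algorithms): a divide-and-conquer sum in which every task forks exactly two children, recursively and asynchronously, and then joins them. The cutoff constant and the use of std::async are illustrative assumptions; real binary-forking runtimes use lighter-weight tasks.

        // Sketch of binary forking: each recursive call forks two child tasks
        // (one asynchronously, one continuing in the current task) and joins them.
        // Requires linking with the thread library (e.g. -pthread with g++/clang++).
        #include <cstddef>
        #include <future>
        #include <iostream>
        #include <numeric>
        #include <vector>

        long long parallel_sum(const std::vector<long long>& a, std::size_t lo, std::size_t hi) {
            if (hi - lo <= (1u << 16))                  // base case: sum sequentially
                return std::accumulate(a.begin() + lo, a.begin() + hi, 0LL);
            std::size_t mid = lo + (hi - lo) / 2;
            // Fork: the left half runs asynchronously, the right half in the current task.
            auto left = std::async(std::launch::async, parallel_sum, std::cref(a), lo, mid);
            long long right = parallel_sum(a, mid, hi);
            return left.get() + right;                  // join the two children
        }

        int main() {
            std::vector<long long> a(1 << 20, 1);
            std::cout << parallel_sum(a, 0, a.size()) << '\n';  // prints 1048576
            return 0;
        }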

    Fast Generation of Random Permutations via Networks Simulation

    No full text
    We consider the problem of generating random permutations with the uniform distribution. That is, we require that for an arbitrary permutation π of n elements, with probability 1/n! the machine halts with the ith output cell containing π(i), for 1 ≤ i ≤ n. We study this problem on two models of parallel computation: the CREW PRAM and the EREW PRAM. The main result of the paper is an algorithm for generating random permutations that runs in O(log log n) time and uses O(n^{1+o(1)}) processors on the CREW PRAM. This is the first o(log n)-time CREW PRAM algorithm for this problem. On the EREW PRAM we present a simple algorithm that generates a random permutation in time O(log n) using n processors and O(n) space. This algorithm outperforms each of the previously known algorithms for the exclusive-write PRAMs. The common and novel feature of both our algorithms is first to design a suitable random switching network generating a permutation and then to simulate this network on..
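
    For context, a minimal sequential baseline for the problem stated in this abstract, namely producing each of the n! permutations with probability exactly 1/n!. This is the classical Fisher-Yates shuffle, not the paper's parallel network-simulation algorithm; the size n below is arbitrary.

        // Fisher-Yates shuffle: a uniform random permutation in O(n) sequential time.
        #include <cstddef>
        #include <iostream>
        #include <numeric>
        #include <random>
        #include <vector>

        int main() {
            const std::size_t n = 10;                 // illustrative size
            std::vector<std::size_t> pi(n);
            std::iota(pi.begin(), pi.end(), 0);       // identity permutation 0..n-1

            std::mt19937_64 rng(std::random_device{}());
            for (std::size_t i = n - 1; i > 0; --i) { // swap slot i with a uniform slot in [0, i]
                std::uniform_int_distribution<std::size_t> d(0, i);
                std::swap(pi[i], pi[d(rng)]);
            }

            for (std::size_t v : pi) std::cout << v << ' ';
            std::cout << '\n';
            return 0;
        }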

    Fast Generation of Random Permutations via Networks Simulation

    No full text
    We consider the classical problem of generating random permutations with the uniform distribution. That is, we require that for an arbitrary permutation π of n elements, with probability 1/n! the machine halts with the ith output cell containing π(i), for 1 ≤ i ≤ n. We study this problem on two models of parallel computation: the CREW PRAM and the EREW PRAM. The main result of the paper is an algorithm for generating random permutations that runs in O(log log n) time and uses O(n^{1+o(1)}) processors on the CREW PRAM. This is the first o(log n)-time CREW PRAM algorithm for this problem. On the EREW PRAM we present a simple algorithm that generates a random permutation in time O(log n) using n processors and O(n) space. This algorithm matches the running time and the number of processors of the best previously known algorithms for the CREW PRAM, and performs better as far as memory usage is concerned. The common and novel feature of both our algorithms is to design first a s..
