26,670 research outputs found

    The Parallel Complexity of Growth Models

    This paper investigates the parallel complexity of several non-equilibrium growth models. Invasion percolation, Eden growth, ballistic deposition and solid-on-solid growth are all seemingly highly sequential processes that yield self-similar or self-affine random clusters. Nonetheless, we present fast parallel randomized algorithms for generating these clusters. The running times of the algorithms scale as $O(\log^2 N)$, where $N$ is the system size, and the number of processors required scales as a polynomial in $N$. The algorithms are based on fast parallel procedures for finding minimum weight paths; they illuminate the close connection between growth models and self-avoiding paths in random environments. In addition to their potential practical value, our algorithms serve to classify these growth models as less complex than other growth models, such as diffusion-limited aggregation, for which fast parallel algorithms probably do not exist. Comment: 20 pages, latex, submitted to J. Stat. Phys., UNH-TR94-0
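
For context on why these models look inherently sequential, here is a minimal sequential sketch of invasion percolation on a square grid. It is not the paper's parallel algorithm. The function name `invasion_percolation`, the grid size, the seed site and the 4-neighbour rule are all illustrative assumptions.

```python
import heapq
import random


def invasion_percolation(size, steps, seed=0):
    """Grow a cluster by repeatedly invading the cheapest perimeter site.

    Hypothetical sequential baseline: each step depends on the cluster
    built so far, which is what makes the process look hard to parallelize.
    """
    rng = random.Random(seed)
    # Independent random weight for every site of a size x size grid.
    weight = [[rng.random() for _ in range(size)] for _ in range(size)]
    start = (size // 2, size // 2)
    cluster = {start}
    frontier = []  # min-heap of (weight, x, y) for perimeter sites

    def push_neighbors(x, y):
        for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nx, ny = x + dx, y + dy
            if 0 <= nx < size and 0 <= ny < size and (nx, ny) not in cluster:
                heapq.heappush(frontier, (weight[nx][ny], nx, ny))

    push_neighbors(*start)
    while frontier and len(cluster) < steps + 1:
        _, x, y = heapq.heappop(frontier)
        if (x, y) in cluster:
            continue  # site was pushed more than once; skip the stale entry
        cluster.add((x, y))
        push_neighbors(x, y)
    return cluster
```

Calling `invasion_percolation(9, 10)` returns the seed site plus ten invaded sites as a set of `(x, y)` pairs.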

    On the Decoding of Polar Codes on Permuted Factor Graphs

    Polar codes are a channel coding scheme for the next-generation wireless communications standard (5G). The belief propagation (BP) decoder allows for parallel decoding of polar codes, making it suitable for high-throughput applications. However, the error-correction performance of polar codes under BP decoding is far from the requirements of 5G. It has been shown that the error-correction performance of BP can be improved if the decoding is performed on multiple permuted factor graphs of polar codes. However, a different BP decoding schedule is required for each factor graph permutation, which results in the design of a different decoder for each permutation. Moreover, the factor graph permutations are selected at random, which prevents the decoder from achieving a desirable error-correction performance with a small number of permutations. In this paper, we first show that the permutations on the factor graph can be mapped into suitable permutations on the codeword positions. As a result, we can make use of a single decoder for all the permutations. In addition, we introduce a method to construct a set of predetermined permutations which can provide the correct codeword if the decoding fails on the original permutation. We show that for the 5G polar code of length $1024$, the error-correction performance of the proposed decoder is more than $0.25$ dB better than that of the BP decoder with the same number of random permutations at the frame error rate of $10^{-4}$.

    Parallel Weighted Random Sampling

    Data structures for efficient sampling from a set of weighted items are an important building block of many applications. However, few parallel solutions are known. We close many of these gaps both for shared-memory and distributed-memory machines. We give efficient, fast, and practicable algorithms for sampling single items, k items with/without replacement, permutations, subsets, and reservoirs. We also give improved sequential algorithms for alias table construction and for sampling with replacement. Experiments on shared-memory parallel machines with up to 158 threads show near-linear speedups both for construction and queries.
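
For readers unfamiliar with alias tables, the sketch below shows the classic sequential construction (Vose's variant of Walker's method) and a constant-time query. It is not the paper's improved or parallel algorithm. The names `build_alias_table` and `sample` are illustrative assumptions.

```python
import random


def build_alias_table(weights):
    """Build (prob, alias) arrays so that sampling takes O(1) per draw."""
    n = len(weights)
    total = sum(weights)
    # Rescale so that an "average" bucket holds exactly mass 1.
    scaled = [w * n / total for w in weights]
    prob = [1.0] * n
    alias = list(range(n))
    small = [i for i, p in enumerate(scaled) if p < 1.0]
    large = [i for i, p in enumerate(scaled) if p >= 1.0]
    while small and large:
        s = small.pop()
        l = large.pop()
        prob[s] = scaled[s]   # bucket s keeps its own item with this probability
        alias[s] = l          # and otherwise yields item l
        scaled[l] -= 1.0 - scaled[s]
        (small if scaled[l] < 1.0 else large).append(l)
    # Buckets left over (floating-point residue) keep prob 1.0.
    return prob, alias


def sample(prob, alias, rng=random):
    """Draw one index with probability proportional to its weight."""
    i = rng.randrange(len(prob))
    return i if rng.random() < prob[i] else alias[i]
```

Construction takes linear time and each query costs one bucket choice plus one biased coin flip, which is why alias tables are a natural building block for repeated sampling with replacement.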

    Optimal Discrete Uniform Generation from Coin Flips, and Applications

    This article introduces an algorithm to draw random discrete uniform variables within a given range of size n from a source of random bits. The algorithm aims to be simple to implement and optimal both with regard to the number of random bits consumed and from a computational perspective, allowing for faster and more efficient Monte-Carlo simulations in computational physics and biology. I also provide a detailed analysis of the number of bits that are spent per variate, and offer some extensions and applications, in particular to the optimal random generation of permutations. Comment: first draft, 22 pages, 5 figures, C code implementation of algorith
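
One well-known way to draw a uniform integer from coin flips is rejection sampling that recycles the leftover randomness instead of discarding it. The sketch below illustrates that idea. It is a minimal illustration and is not claimed to reproduce the article's C implementation. The name `uniform_from_bits` and the `flip` parameter are assumptions.

```python
import random


def uniform_from_bits(n, flip=lambda: random.getrandbits(1)):
    """Return a uniform integer in [0, n) using only single-bit draws.

    Invariant: c is uniform in [0, v). When v >= n we either accept c
    (if c < n) or keep the leftover c - n, which is uniform in [0, v - n).
    """
    if n < 1:
        raise ValueError("n must be a positive integer")
    v, c = 1, 0
    while True:
        v = 2 * v
        c = 2 * c + flip()  # append one random bit
        if v >= n:
            if c < n:
                return c
            # Rejected: recycle the remaining randomness.
            v -= n
            c -= n
```

Passing a deterministic bit source makes the behaviour easy to trace: with `n = 6` and bits 1, 0, 1 the state `(v, c)` evolves as (2, 1), (4, 2), (8, 5), and 5 is returned because 5 < 6.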

    Pruned Bit-Reversal Permutations: Mathematical Characterization, Fast Algorithms and Architectures

    A mathematical characterization of serially-pruned permutations (SPPs) employed in variable-length permuters and their associated fast pruning algorithms and architectures are proposed. Permuters are used in many signal processing systems for shuffling data and in communication systems as an adjunct to coding for error correction. Typically only a small set of discrete permuter lengths is supported. Serial pruning is a simple technique to alter the length of a permutation to support a wider range of lengths, but results in a serial processing bottleneck. In this paper, parallelizing SPPs is formulated in terms of recursively computing sums involving integer floor and related functions using integer operations, in a fashion analogous to evaluating Dedekind sums. A mathematical treatment for bit-reversal permutations (BRPs) is presented, and closed-form expressions for BRP statistics are derived. It is shown that BRP sequences have weak correlation properties. A new statistic called permutation inliers that characterizes the pruning gap of pruned interleavers is proposed. Using this statistic, a recursive algorithm that computes the minimum inliers count of a pruned BR interleaver (PBRI) in logarithmic time complexity is presented. This algorithm enables parallelizing a serial PBRI algorithm by any desired parallelism factor by computing the pruning gap in a lookahead rather than a serial fashion, resulting in a significant reduction in interleaving latency and memory overhead. Extensions to 2-D block and stream interleavers, as well as applications to pruned fast Fourier transforms and LTE turbo interleavers, are also presented. Moreover, hardware-efficient architectures for the proposed algorithms are developed. Simulation results demonstrate 3 to 4 orders of magnitude improvement in interleaving time compared to existing approaches. Comment: 31 page
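
The serial bottleneck mentioned above can be seen in a minimal sketch of a serially pruned bit-reversal permutation: the output position of each kept index depends on how many earlier indices were pruned. This is only the naive serial baseline, not the paper's lookahead algorithm. The names `bit_reverse` and `pruned_bit_reversal` are illustrative assumptions.

```python
def bit_reverse(i, nbits):
    """Reverse the lowest nbits bits of the non-negative integer i."""
    r = 0
    for _ in range(nbits):
        r = (r << 1) | (i & 1)
        i >>= 1
    return r


def pruned_bit_reversal(length):
    """Permutation of range(length) obtained by pruning a bit reversal.

    Walk the full bit-reversal permutation of the next power of two and
    drop every value >= length. The walk is inherently serial because the
    number of pruned values seen so far determines where each kept value
    lands in the output.
    """
    if length < 1:
        raise ValueError("length must be positive")
    nbits = (length - 1).bit_length()
    out = []
    for i in range(1 << nbits):
        j = bit_reverse(i, nbits)
        if j < length:  # values outside the target length are pruned
            out.append(j)
    return out
```

For `length = 5` the 3-bit reversal sequence is 0, 4, 2, 6, 1, 5, 3, 7; pruning 6, 5 and 7 leaves `[0, 4, 2, 1, 3]`.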