The Parallel Complexity of Growth Models
This paper investigates the parallel complexity of several non-equilibrium
growth models. Invasion percolation, Eden growth, ballistic deposition and
solid-on-solid growth are all seemingly highly sequential processes that yield
self-similar or self-affine random clusters. Nonetheless, we present fast
parallel randomized algorithms for generating these clusters. The running times
of the algorithms scale as $O(\log^2 N)$, where $N$ is the system size, and the
number of processors required scales as a polynomial in $N$. The algorithms are
based on fast parallel procedures for finding minimum weight paths; they
illuminate the close connection between growth models and self-avoiding paths
in random environments. In addition to their potential practical value, our
algorithms serve to classify these growth models as less complex than other
growth models, such as diffusion-limited aggregation, for which fast parallel
algorithms probably do not exist.
Comment: 20 pages, latex, submitted to J. Stat. Phys., UNH-TR94-0
On the Decoding of Polar Codes on Permuted Factor Graphs
Polar codes are a channel coding scheme for the next-generation wireless
communications standard (5G). The belief propagation (BP) decoder allows for
parallel decoding of polar codes, making it suitable for high throughput
applications. However, the error-correction performance of polar codes under BP
decoding is far from the requirements of 5G. It has been shown that the
error-correction performance of BP can be improved if the decoding is performed
on multiple permuted factor graphs of polar codes. However, a different BP
decoding scheduling is required for each factor graph permutation which results
in the design of a different decoder for each permutation. Moreover, the
factor graph permutations are selected at random, which
prevents the decoder from achieving the desired error-correction performance with a
small number of permutations. In this paper, we first show that the
permutations on the factor graph can be mapped into suitable permutations on
the codeword positions. As a result, we can make use of a single decoder for
all the permutations. In addition, we introduce a method to construct a set of
predetermined permutations which can provide the correct codeword if the
decoding fails on the original permutation. We show that for the 5G polar code
of length , the error-correction performance of the proposed decoder is
more than dB better than that of the BP decoder with the same number of
random permutations at the frame error rate of
Parallel Weighted Random Sampling
Data structures for efficient sampling from a set of weighted items are an important building block of many applications. However, few parallel solutions are known. We close many of these gaps both for shared-memory and distributed-memory machines. We give efficient, fast, and practicable algorithms for sampling single items, k items with/without replacement, permutations, subsets, and reservoirs. We also give improved sequential algorithms for alias table construction and for sampling with replacement. Experiments on shared-memory parallel machines with up to 158 threads show near-linear speedups both for construction and queries.
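For readers unfamiliar with alias tables, the sketch below shows the classic sequential construction (often attributed to Walker and Vose) and a constant-time query. It is not the parallel or improved sequential algorithm from the paper; the function names `build_alias_table` and `sample` are illustrative assumptions.

```python
import random

def build_alias_table(weights):
    """Classic sequential alias-table construction in O(n) time.

    Assumption: this is the textbook Vose-style method, not the
    paper's parallel or improved sequential variant.
    """
    n = len(weights)
    total = sum(weights)
    if n == 0 or total <= 0:
        raise ValueError("need at least one item with positive weight")
    # Scale weights so that the average bucket load is exactly 1.
    scaled = [w * n / total for w in weights]
    prob = [0.0] * n
    alias = [0] * n
    small = [i for i, p in enumerate(scaled) if p < 1.0]
    large = [i for i, p in enumerate(scaled) if p >= 1.0]
    while small and large:
        s = small.pop()
        l = large.pop()
        prob[s] = scaled[s]   # bucket s keeps item s with this probability
        alias[s] = l          # and is topped up by item l
        scaled[l] = scaled[l] + scaled[s] - 1.0
        (small if scaled[l] < 1.0 else large).append(l)
    # Leftovers are full buckets (up to floating-point rounding).
    for i in small + large:
        prob[i] = 1.0
        alias[i] = i
    return prob, alias

def sample(prob, alias, rng=random):
    """Draw one index in O(1): pick a bucket, then item or its alias."""
    i = rng.randrange(len(prob))
    return i if rng.random() < prob[i] else alias[i]
```

A call such as `prob, alias = build_alias_table([1, 3])` followed by repeated `sample(prob, alias)` returns index 1 about three times as often as index 0.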
Optimal Discrete Uniform Generation from Coin Flips, and Applications
This article introduces an algorithm to draw random discrete uniform
variables within a given range of size n from a source of random bits. The
algorithm aims to be simple to implement and optimal both with regard to the
number of random bits consumed and from a computational perspective, allowing
for faster and more efficient Monte-Carlo simulations in computational physics
and biology. I also provide a detailed analysis of the number of bits that are
spent per variate, and offer some extensions and applications, in particular to
the optimal random generation of permutations.
Comment: first draft, 22 pages, 5 figures, C code implementation of algorith
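The abstract does not spell out the algorithm, so the following is only a sketch under assumptions. It follows the bit-by-bit rejection scheme commonly known as the Fast Dice Roller, which draws a uniform integer in `[0, n)` from fair coin flips. The names `random_bit`, `fast_dice_roller`, and `random_permutation` are hypothetical, and the shuffle is a plain Fisher-Yates loop rather than the article's optimized permutation generator.

```python
import random

def random_bit():
    """Default source of fair random bits (assumption: Python's PRNG)."""
    return random.getrandbits(1)

def fast_dice_roller(n, flip=random_bit):
    """Return a uniform integer in range(n) using coin flips only.

    Invariant: c is uniform over range(v). When v reaches n, either
    c < n and we accept, or we recycle the leftover range v - n
    instead of discarding it.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    v, c = 1, 0
    while True:
        v = 2 * v
        c = 2 * c + flip()
        if v >= n:
            if c < n:
                return c
            v -= n
            c -= n

def random_permutation(n, flip=random_bit):
    """Fisher-Yates shuffle of range(n) driven by fast_dice_roller."""
    perm = list(range(n))
    for i in range(n - 1, 0, -1):
        j = fast_dice_roller(i + 1, flip)
        perm[i], perm[j] = perm[j], perm[i]
    return perm
```

Passing a custom `flip` callable makes the bit source explicit, which is convenient for counting how many bits each variate consumes.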
Pruned Bit-Reversal Permutations: Mathematical Characterization, Fast Algorithms and Architectures
A mathematical characterization of serially-pruned permutations (SPPs)
employed in variable-length permuters and their associated fast pruning
algorithms and architectures are proposed. Permuters are used in many signal
processing systems for shuffling data and in communication systems as an
adjunct to coding for error correction. Typically only a small set of discrete
permuter lengths are supported. Serial pruning is a simple technique to alter
the length of a permutation to support a wider range of lengths, but results in
a serial processing bottleneck. In this paper, parallelizing SPPs is formulated
in terms of recursively computing sums involving integer floor and related
functions using integer operations, in a fashion analogous to evaluating
Dedekind sums. A mathematical treatment for bit-reversal permutations (BRPs) is
presented, and closed-form expressions for BRP statistics are derived. It is
shown that BRP sequences have weak correlation properties. A new statistic
called permutation inliers that characterizes the pruning gap of pruned
interleavers is proposed. Using this statistic, a recursive algorithm that
computes the minimum inliers count of a pruned BR interleaver (PBRI) in
logarithmic time complexity is presented. This algorithm enables parallelizing
a serial PBRI algorithm by any desired parallelism factor by computing the
pruning gap in lookahead rather than a serial fashion, resulting in significant
reduction in interleaving latency and memory overhead. Extensions to 2-D block
and stream interleavers, as well as applications to pruned fast Fourier
transforms and LTE turbo interleavers, are also presented. Moreover,
hardware-efficient architectures for the proposed algorithms are developed.
Simulation results demonstrate 3 to 4 orders of magnitude improvement in
interleaving time compared to existing approaches.
Comment: 31 page
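To make the serial bottleneck concrete, here is a minimal sketch of serial pruning applied to a bit-reversal permutation: the full power-of-two permutation is walked in order and every index outside the target length is skipped. This is the baseline the abstract describes, not the proposed lookahead algorithm; the function names are illustrative assumptions.

```python
def bit_reverse(i, nbits):
    """Reverse the lowest nbits bits of the integer i."""
    r = 0
    for _ in range(nbits):
        r = (r << 1) | (i & 1)
        i >>= 1
    return r

def pruned_bit_reversal(length):
    """Serially pruned bit-reversal permutation of range(length).

    The mother permutation has size 2**nbits >= length. Entries that
    map outside range(length) are pruned, so the output position of
    each kept entry depends on how many earlier entries were pruned.
    That dependency is the serial bottleneck.
    """
    nbits = max(1, (length - 1).bit_length())
    out = []
    for i in range(1 << nbits):
        j = bit_reverse(i, nbits)
        if j < length:
            out.append(j)
    return out

print(pruned_bit_reversal(5))  # [0, 4, 2, 1, 3]
```

For a length of 5 the mother permutation on 8 points is 0, 4, 2, 6, 1, 5, 3, 7; pruning 6, 5, and 7 leaves the five-element permutation printed above.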