A Lower Bound Technique for Communication in BSP
Communication is a major factor determining the performance of algorithms on
current computing systems; it is therefore valuable to provide tight lower
bounds on the communication complexity of computations. This paper presents a
lower bound technique for the communication complexity in the bulk-synchronous
parallel (BSP) model of a given class of DAG computations. The derived bound is
expressed in terms of the switching potential of a DAG, that is, the number of
permutations that the DAG can realize when viewed as a switching network. The
proposed technique yields tight lower bounds for the fast Fourier transform
(FFT), and for any sorting and permutation network. A stronger bound is also
derived for the periodic balanced sorting network, by applying this technique
to suitable subnetworks. Finally, we demonstrate that the switching potential
captures communication requirements even in computational models different from
BSP, such as the I/O model and the LPRAM.
Faster 3-Periodic Merging Networks
We consider the problem of merging two sorted sequences on a comparator
network that is used repeatedly, that is, if the output is not sorted, the
network is applied again using the output as input. The challenging task is to
construct such networks of small depth. The first constructions of merging
networks with a constant period were given by Kutyłowski, Loryś and
Oesterdiekhoff. They have given -periodic network that merges two sorted
sequences of numbers in time and a similar network of period
that works in . We present a new family of such networks that are
based on the Canfield and Williamson periodic sorter. Our 3-periodic merging
networks work in time upper-bounded by . The construction can be
easily generalized to larger constant periods with decreasing running time, for
example, to -periodic ones that work in time upper-bounded by .
Moreover, to obtain these results we have introduced a new proof technique.
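The idea of a comparator network that is applied repeatedly until its output is sorted can be illustrated with a small sketch. This is not the construction from the abstract above: it assumes the classic odd-even transposition scheme as a stand-in 2-periodic network, and the names `apply_layer`, `odd_even_period` and `run_periodic` are hypothetical.

```python
from typing import List, Sequence, Tuple

Comparator = Tuple[int, int]   # (i, j) with i < j: smaller value ends up at i
Layer = List[Comparator]       # comparators in one layer touch disjoint wires


def apply_layer(values: Sequence[int], layer: Layer) -> List[int]:
    """Apply one layer of comparators and return the new wire values."""
    out = list(values)
    for i, j in layer:
        if out[i] > out[j]:
            out[i], out[j] = out[j], out[i]
    return out


def odd_even_period(n: int) -> List[Layer]:
    """Stand-in 2-periodic network (assumption): odd-even transposition."""
    even = [(i, i + 1) for i in range(0, n - 1, 2)]
    odd = [(i, i + 1) for i in range(1, n - 1, 2)]
    return [even, odd]


def run_periodic(values: Sequence[int], period: List[Layer],
                 max_passes: int) -> Tuple[List[int], int]:
    """Feed the output back into the same network until it is sorted."""
    current = list(values)
    passes = 0
    while current != sorted(current) and passes < max_passes:
        for layer in period:
            current = apply_layer(current, layer)
        passes += 1
    return current, passes


# Merging input: two sorted sequences placed side by side.
merged_input = [1, 4, 6, 9] + [2, 3, 7, 8]
result, passes = run_periodic(merged_input, odd_even_period(8), max_passes=8)
```

The running time of such a network is the period multiplied by the number of passes, which is why constructions with a small constant period and few passes are of interest.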
Group-theoretic models of the inversion process in bacterial genomes
The variation in genome arrangements among bacterial taxa is largely due to
the process of inversion. Recent studies indicate that not all inversions are
equally probable, suggesting, for instance, that shorter inversions are more
frequent than longer, and those that move the terminus of replication are less
probable than those that do not. Current methods for establishing the inversion
distance between two bacterial genomes are unable to incorporate such
information. In this paper we suggest a group-theoretic framework that in
principle can take these constraints into account. In particular, we show that
by lifting the problem from circular permutations to the affine symmetric
group, the inversion distance can be found in polynomial time for a model in
which inversions are restricted to acting on two regions. This requires the
proof of new results in group theory, and suggests a vein of new combinatorial
problems concerning permutation groups on which group theorists will be needed
to collaborate with biologists. We apply the new method to inferring distances
and phylogenies for published Yersinia pestis data. Comment: 19 pages, 7 figures, in press, Journal of Mathematical Biology
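To make the notion of an inversion acting on a circular genome concrete, here is a minimal sketch. It assumes unsigned genes, treats two arrangements as equal when they differ only by rotation, and uses brute-force breadth-first search rather than the polynomial-time group-theoretic method described in the abstract. The names `invert`, `canonical` and `inversion_distance` are hypothetical.

```python
from collections import deque
from typing import Tuple

Genome = Tuple[int, ...]


def invert(genome: Genome, start: int, length: int) -> Genome:
    """Reverse `length` consecutive regions, wrapping around the circle."""
    n = len(genome)
    positions = [(start + k) % n for k in range(length)]
    segment = [genome[p] for p in positions]
    out = list(genome)
    for p, gene in zip(positions, reversed(segment)):
        out[p] = gene
    return tuple(out)


def canonical(genome: Genome) -> Genome:
    """Rotate so the smallest gene comes first (rotation equivalence only)."""
    k = genome.index(min(genome))
    return genome[k:] + genome[:k]


def inversion_distance(source: Genome, target: Genome, length: int = 2) -> int:
    """Fewest inversions of the given length from source to target.

    Exhaustive search, feasible only for very small genomes.
    Returns -1 if the target cannot be reached.
    """
    goal = canonical(target)
    start = canonical(source)
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        genome, dist = queue.popleft()
        if genome == goal:
            return dist
        for s in range(len(genome)):
            nxt = canonical(invert(genome, s, length))
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return -1
```

With `length=2` every move swaps two adjacent regions on the circle, which corresponds to the restricted model mentioned above only under the stated assumptions.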
Petascale turbulence simulation using a highly parallel fast multipole method on GPUs
This paper reports large-scale direct numerical simulations of
homogeneous-isotropic fluid turbulence, achieving sustained performance of 1.08
petaflop/s on GPU hardware using single precision. The simulations use a vortex
particle method to solve the Navier-Stokes equations, with a highly parallel
fast multipole method (FMM) as numerical engine, and match the current record
in mesh size for this application, a cube of 4096^3 computational points solved
with a spectral method. The standard numerical approach used in this field is
the pseudo-spectral method, relying on the FFT algorithm as numerical engine.
The particle-based simulations presented in this paper quantitatively match the
kinetic energy spectrum obtained with a pseudo-spectral method, using a trusted
code. In terms of parallel performance, weak scaling results show the FMM-based
vortex method achieving 74% parallel efficiency on 4096 processes (one GPU per
MPI process, 3 GPUs per node of the TSUBAME-2.0 system). The FFT-based spectral
method is able to achieve just 14% parallel efficiency on the same number of
MPI processes (using only CPU cores), due to the all-to-all communication
pattern of the FFT algorithm. The calculation time for one time step was 108
seconds for the vortex method and 154 seconds for the spectral method, under
these conditions. Computing with 69 billion particles, this work exceeds by an
order of magnitude the largest vortex method calculations to date.
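The efficiency figures quoted above follow the usual weak-scaling convention, in which the problem size per process is held fixed and the run time on one process is compared with the run time on many. A minimal sketch of that arithmetic is given below. The function name `weak_scaling_efficiency` is hypothetical, and only the two per-step times (108 s and 154 s) are taken from the abstract.

```python
def weak_scaling_efficiency(t_base: float, t_scaled: float) -> float:
    """Weak-scaling efficiency: base run time divided by scaled run time.

    The work per process is assumed constant, so ideal scaling keeps the
    run time unchanged and gives an efficiency of 1.0.
    """
    if t_base <= 0 or t_scaled <= 0:
        raise ValueError("run times must be positive")
    return t_base / t_scaled


# Per-step times reported in the abstract, in seconds.
vortex_step_s = 108.0
spectral_step_s = 154.0

# How many times longer one spectral-method step took than one vortex step.
step_time_ratio = spectral_step_s / vortex_step_s
```

The abstract does not give the single-process timings behind the 74% and 14% figures, so they cannot be recomputed here; the function only shows how such a figure would be derived from two measured times.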
Sorting Omega Networks Simulated with P Systems: Optimal Data Layouts
The paper introduces some sorting networks and their simulation with P
systems, in which each processor/membrane can hold more than one piece of data, and
perform operations on them internally. Several data layouts are discussed in this context,
and an optimal one is proposed, together with its implementation as a P system with
dynamic communication graphs.
Persistent dynamic attractors in activity patterns of cultured neuronal networks
Three remarkable features of the nervous system (complex spatiotemporal patterns, oscillations, and persistent activity) are fundamental to such diverse functions as stereotypical motor behavior, working memory, and awareness. Here we report that cultured cortical networks spontaneously generate a hierarchical structure of periodic activity with a strongly stereotyped population-wide spatiotemporal structure demonstrating all three fundamental properties in a recurring pattern. During these "superbursts," the firing sequence of the culture periodically converges to a dynamic attractor orbit. Precursors of oscillations and persistent activity have previously been reported as intrinsic properties of the neurons. However, complex spatiotemporal patterns that are coordinated in a large population of neurons and persist over several hours, and thus are capable of representing and preserving information, cannot be explained by known oscillatory properties of isolated neurons. Instead, the complexity of the observed spatiotemporal patterns implies large-scale self-organization of neurons interacting in a precise temporal order even in vitro, in cultures usually considered to have random connectivity.