22,723 research outputs found

    A Lower Bound Technique for Communication in BSP

    Get PDF
    Communication is a major factor determining the performance of algorithms on current computing systems; it is therefore valuable to provide tight lower bounds on the communication complexity of computations. This paper presents a lower bound technique for the communication complexity in the bulk-synchronous parallel (BSP) model of a given class of DAG computations. The derived bound is expressed in terms of the switching potential of a DAG, that is, the number of permutations that the DAG can realize when viewed as a switching network. The proposed technique yields tight lower bounds for the fast Fourier transform (FFT), and for any sorting and permutation network. A stronger bound is also derived for the periodic balanced sorting network, by applying this technique to suitable subnetworks. Finally, we demonstrate that the switching potential captures communication requirements even in computational models different from BSP, such as the I/O model and the LPRAM

    Faster 3-Periodic Merging Networks

    Full text link
    We consider the problem of merging two sorted sequences on a comparator network that is used repeatedly, that is, if the output is not sorted, the network is applied again using the output as input. The challenging task is to construct such networks of small depth. The first constructions of merging networks with a constant period were given by Kuty{\l}owski, Lory\'s and Oesterdikhoff. They have given 33-periodic network that merges two sorted sequences of NN numbers in time 12logN12\log N and a similar network of period 44 that works in 5.67logN5.67\log N. We present a new family of such networks that are based on Canfield and Williamson periodic sorter. Our 33-periodic merging networks work in time upper-bounded by 6logN6\log N. The construction can be easily generalized to larger constant periods with decreasing running time, for example, to 44-periodic ones that work in time upper-bounded by 4logN4\log N. Moreover, to obtain the facts we have introduced a new proof technique

    Group-theoretic models of the inversion process in bacterial genomes

    Full text link
    The variation in genome arrangements among bacterial taxa is largely due to the process of inversion. Recent studies indicate that not all inversions are equally probable, suggesting, for instance, that shorter inversions are more frequent than longer, and those that move the terminus of replication are less probable than those that do not. Current methods for establishing the inversion distance between two bacterial genomes are unable to incorporate such information. In this paper we suggest a group-theoretic framework that in principle can take these constraints into account. In particular, we show that by lifting the problem from circular permutations to the affine symmetric group, the inversion distance can be found in polynomial time for a model in which inversions are restricted to acting on two regions. This requires the proof of new results in group theory, and suggests a vein of new combinatorial problems concerning permutation groups on which group theorists will be needed to collaborate with biologists. We apply the new method to inferring distances and phylogenies for published Yersinia pestis data.Comment: 19 pages, 7 figures, in Press, Journal of Mathematical Biolog

    Petascale turbulence simulation using a highly parallel fast multipole method on GPUs

    Full text link
    This paper reports large-scale direct numerical simulations of homogeneous-isotropic fluid turbulence, achieving sustained performance of 1.08 petaflop/s on gpu hardware using single precision. The simulations use a vortex particle method to solve the Navier-Stokes equations, with a highly parallel fast multipole method (FMM) as numerical engine, and match the current record in mesh size for this application, a cube of 4096^3 computational points solved with a spectral method. The standard numerical approach used in this field is the pseudo-spectral method, relying on the FFT algorithm as numerical engine. The particle-based simulations presented in this paper quantitatively match the kinetic energy spectrum obtained with a pseudo-spectral method, using a trusted code. In terms of parallel performance, weak scaling results show the fmm-based vortex method achieving 74% parallel efficiency on 4096 processes (one gpu per mpi process, 3 gpus per node of the TSUBAME-2.0 system). The FFT-based spectral method is able to achieve just 14% parallel efficiency on the same number of mpi processes (using only cpu cores), due to the all-to-all communication pattern of the FFT algorithm. The calculation time for one time step was 108 seconds for the vortex method and 154 seconds for the spectral method, under these conditions. Computing with 69 billion particles, this work exceeds by an order of magnitude the largest vortex method calculations to date

    Sorting Omega Networks Simulated with P Systems: Optimal Data Layouts

    Get PDF
    The paper introduces some sorting networks and their simulation with P systems, in which each processor/membrane can hold more than one piece of data, and perform operations on them internally. Several data layouts are discussed in this context, and an optimal one is proposed, together with its implementation as a P system with dynamic communication graphs

    Persistent dynamic attractors in activity patterns of cultured neuronal networks

    Get PDF
    Three remarkable features of the nervous system—complex spatiotemporal patterns, oscillations, and persistent activity—are fundamental to such diverse functions as stereotypical motor behavior, working memory, and awareness. Here we report that cultured cortical networks spontaneously generate a hierarchical structure of periodic activity with a strongly stereotyped population-wide spatiotemporal structure demonstrating all three fundamental properties in a recurring pattern. During these "superbursts," the firing sequence of the culture periodically converges to a dynamic attractor orbit. Precursors of oscillations and persistent activity have previously been reported as intrinsic properties of the neurons. However, complex spatiotemporal patterns that are coordinated in a large population of neurons and persist over several hours—and thus are capable of representing and preserving information—cannot be explained by known oscillatory properties of isolated neurons. Instead, the complexity of the observed spatiotemporal patterns implies large-scale self-organization of neurons interacting in a precise temporal order even in vitro, in cultures usually considered to have random connectivity
    corecore