22,723 research outputs found

A Lower Bound Technique for Communication in BSP

Author: Bilardi Gianfranco
Scquizzato Michele
Silvestri Francesco
Publication date: 25/11/2017

Communication is a major factor determining the performance of algorithms on current computing systems; it is therefore valuable to provide tight lower bounds on the communication complexity of computations. This paper presents a lower bound technique for the communication complexity in the bulk-synchronous parallel (BSP) model of a given class of DAG computations. The derived bound is expressed in terms of the switching potential of a DAG, that is, the number of permutations that the DAG can realize when viewed as a switching network. The proposed technique yields tight lower bounds for the fast Fourier transform (FFT), and for any sorting and permutation network. A stronger bound is also derived for the periodic balanced sorting network, by applying this technique to suitable subnetworks. Finally, we demonstrate that the switching potential captures communication requirements even in computational models different from BSP, such as the I/O model and the LPRAM.
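As a rough illustration of "the number of permutations that the DAG can realize when viewed as a switching network", the sketch below enumerates every setting of a tiny network of 2x2 switches and counts the distinct permutations produced. This is a simplified reading, not the paper's formal definition: the function name `realizable_permutations`, the stage encoding, and the 4-input butterfly layout `BUTTERFLY_4` are assumptions made for this example.

```python
from itertools import product

def realizable_permutations(n, stages):
    """Return the set of permutations of n wires that the network can realize.

    stages is a list of stages; each stage is a list of (i, j) wire pairs,
    and each pair is treated as a 2x2 switch that is either straight or crossed.
    (Hypothetical encoding chosen for this sketch.)
    """
    switches = [pair for stage in stages for pair in stage]
    seen = set()
    # Brute force: try every straight/crossed assignment of all switches.
    for setting in product((False, True), repeat=len(switches)):
        wires = list(range(n))
        for (i, j), crossed in zip(switches, setting):
            if crossed:
                wires[i], wires[j] = wires[j], wires[i]
        seen.add(tuple(wires))
    return seen

# Assumed 4-input butterfly (FFT-shaped) network: two stages of two switches.
BUTTERFLY_4 = [[(0, 1), (2, 3)], [(0, 2), (1, 3)]]

print(len(realizable_permutations(4, BUTTERFLY_4)))  # 16 of the 4! = 24 permutations
```

The brute-force enumeration is exponential in the number of switches, so it only serves to make the counting idea concrete on very small networks.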

arXiv.org e-Print Archive

Archivio istituzionale della ricerca - Università di Padova

Faster 3-Periodic Merging Networks

Author: Piotrów Marek
Publication date: 02/01/2014

We consider the problem of merging two sorted sequences on a comparator network that is used repeatedly, that is, if the output is not sorted, the network is applied again using the output as input. The challenging task is to construct such networks of small depth. The first constructions of merging networks with a constant period were given by Kutyłowski, Loryś and Oesterdiekhoff. They have given a $3$-periodic network that merges two sorted sequences of $N$ numbers in time $12\log N$ and a similar network of period $4$ that works in $5.67\log N$. We present a new family of such networks that are based on the Canfield and Williamson periodic sorter. Our $3$-periodic merging networks work in time upper-bounded by $6\log N$. The construction can be easily generalized to larger constant periods with decreasing running time, for example, to $4$-periodic ones that work in time upper-bounded by $4\log N$. Moreover, to obtain the facts we have introduced a new proof technique.
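The idea of a comparator network "used repeatedly" can be sketched as follows. This is only an illustration of periodic application, not the construction from the paper: it uses the classic odd-even transposition network as an assumed stand-in with period 2, and all names (`apply_stage`, `odd_even_period`, `run_periodic`) are hypothetical.

```python
from typing import List, Sequence, Tuple

Comparator = Tuple[int, int]   # (i, j) with i < j: the smaller value goes to i
Stage = List[Comparator]       # comparators in one stage touch disjoint wires

def apply_stage(values: Sequence[int], stage: Stage) -> List[int]:
    """Apply one stage of comparators to a copy of the input."""
    out = list(values)
    for i, j in stage:
        if out[i] > out[j]:
            out[i], out[j] = out[j], out[i]
    return out

def odd_even_period(n: int) -> List[Stage]:
    """One period (two stages) of odd-even transposition on n wires.
    Stand-in network for this sketch, not the paper's 3-periodic merger."""
    even = [(i, i + 1) for i in range(0, n - 1, 2)]
    odd = [(i, i + 1) for i in range(1, n - 1, 2)]
    return [even, odd]

def run_periodic(values: Sequence[int], period: List[Stage], max_passes: int):
    """Feed the output back into the same network until it is sorted."""
    current = list(values)
    passes = 0
    while current != sorted(current) and passes < max_passes:
        for stage in period:
            current = apply_stage(current, stage)
        passes += 1
    return current, passes

# Two sorted sequences placed side by side, as in the merging problem.
data = [1, 4, 6, 9] + [2, 3, 7, 8]
result, passes = run_periodic(data, odd_even_period(len(data)), max_passes=len(data))
print(result, passes)
```

The running time of such a scheme is the number of passes times the period, which is why the abstract focuses on keeping the period constant while reducing the total time.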

arXiv.org e-Print Archive

Crossref

Group-theoretic models of the inversion process in bacterial genomes

Author: Egri-Nagy Attila
Francis Andrew R.
Gebhardt Volker
Tanaka Mark M.
Publication date: 01/01/2014

The variation in genome arrangements among bacterial taxa is largely due to the process of inversion. Recent studies indicate that not all inversions are equally probable, suggesting, for instance, that shorter inversions are more frequent than longer, and those that move the terminus of replication are less probable than those that do not. Current methods for establishing the inversion distance between two bacterial genomes are unable to incorporate such information. In this paper we suggest a group-theoretic framework that in principle can take these constraints into account. In particular, we show that by lifting the problem from circular permutations to the affine symmetric group, the inversion distance can be found in polynomial time for a model in which inversions are restricted to acting on two regions. This requires the proof of new results in group theory, and suggests a vein of new combinatorial problems concerning permutation groups on which group theorists will be needed to collaborate with biologists. We apply the new method to inferring distances and phylogenies for published Yersinia pestis data. Comment: 19 pages, 7 figures, in press, Journal of Mathematical Biology

arXiv.org e-Print Archive

Western Sydney ResearchDirect

Petascale turbulence simulation using a highly parallel fast multipole method on GPUs

Author: Barnes
Chatelain
Cheng
Cottet
Davidson
Dehnen
Gingold
Greengard
Hamada
Ishihara
Kenji Yasuoka
L.A. Barba
Lambert
Rahimian
Rio Yokota
Salmon
Sundar
Tetsu Narumi
Warren
Yokokawa
Yokota
Publication venue: 'Elsevier BV'
Publication date: 03/09/2012

This paper reports large-scale direct numerical simulations of homogeneous-isotropic fluid turbulence, achieving sustained performance of 1.08 petaflop/s on GPU hardware using single precision. The simulations use a vortex particle method to solve the Navier-Stokes equations, with a highly parallel fast multipole method (FMM) as numerical engine, and match the current record in mesh size for this application, a cube of 4096^3 computational points solved with a spectral method. The standard numerical approach used in this field is the pseudo-spectral method, relying on the FFT algorithm as numerical engine. The particle-based simulations presented in this paper quantitatively match the kinetic energy spectrum obtained with a pseudo-spectral method, using a trusted code. In terms of parallel performance, weak scaling results show the FMM-based vortex method achieving 74% parallel efficiency on 4096 processes (one GPU per MPI process, 3 GPUs per node of the TSUBAME-2.0 system). The FFT-based spectral method is able to achieve just 14% parallel efficiency on the same number of MPI processes (using only CPU cores), due to the all-to-all communication pattern of the FFT algorithm. The calculation time for one time step was 108 seconds for the vortex method and 154 seconds for the spectral method, under these conditions. Computing with 69 billion particles, this work exceeds by an order of magnitude the largest vortex method calculations to date.

arXiv.org e-Print Archive

Crossref

Sorting Omega Networks Simulated with P Systems: Optimal Data Layouts

Author: Ceterchi Rodica
Pérez Jiménez Mario de Jesús
Tomescu Alexandru Ioan
Publication venue: Fénix Editora
Publication date: 01/01/2008

The paper introduces some sorting networks and their simulation with P systems, in which each processor/membrane can hold more than one piece of data and perform operations on them internally. Several data layouts are discussed in this context, and an optimal one is proposed, together with its implementation as a P system with dynamic communication graphs.

idUS. Depósito de Investigación Universidad de Sevilla

Persistent dynamic attractors in activity patterns of cultured neuronal networks

Author: Nádasdy Zoltan
Potter Steve M.
Wagenaar Daniel A.
Publication venue: 'American Physical Society (APS)'
Publication date: 01/05/2006

Three remarkable features of the nervous system—complex spatiotemporal patterns, oscillations, and persistent activity—are fundamental to such diverse functions as stereotypical motor behavior, working memory, and awareness. Here we report that cultured cortical networks spontaneously generate a hierarchical structure of periodic activity with a strongly stereotyped population-wide spatiotemporal structure demonstrating all three fundamental properties in a recurring pattern. During these "superbursts," the firing sequence of the culture periodically converges to a dynamic attractor orbit. Precursors of oscillations and persistent activity have previously been reported as intrinsic properties of the neurons. However, complex spatiotemporal patterns that are coordinated in a large population of neurons and persist over several hours—and thus are capable of representing and preserving information—cannot be explained by known oscillatory properties of isolated neurons. Instead, the complexity of the observed spatiotemporal patterns implies large-scale self-organization of neurons interacting in a precise temporal order even in vitro, in cultures usually considered to have random connectivity

Caltech Authors