108,660 research outputs found

    A Distributed Algorithm for Directed Minimum-Weight Spanning Tree

    Get PDF

    Distributed Computing on Core-Periphery Networks: Axiom-based Design

    Full text link
    Inspired by social networks and complex systems, we propose a core-periphery network architecture that supports fast computation for many distributed algorithms and is robust and efficient in number of links. Rather than providing a concrete network model, we take an axiom-based design approach. We provide three intuitive (and independent) algorithmic axioms and prove that any network that satisfies all axioms enjoys an efficient algorithm for a range of tasks (e.g., MST, sparse matrix multiplication, etc.). We also show the minimality of our axiom set: for networks that satisfy any subset of the axioms, the same efficiency cannot be guaranteed for any deterministic algorithm

    Parametric shortest-path algorithms via tropical geometry

    Full text link
    We study parameterized versions of classical algorithms for computing shortest-path trees. This is most easily expressed in terms of tropical geometry. Applications include shortest paths in traffic networks with variable link travel times.Comment: 24 pages and 8 figure

    Petascale turbulence simulation using a highly parallel fast multipole method on GPUs

    Full text link
    This paper reports large-scale direct numerical simulations of homogeneous-isotropic fluid turbulence, achieving sustained performance of 1.08 petaflop/s on gpu hardware using single precision. The simulations use a vortex particle method to solve the Navier-Stokes equations, with a highly parallel fast multipole method (FMM) as numerical engine, and match the current record in mesh size for this application, a cube of 4096^3 computational points solved with a spectral method. The standard numerical approach used in this field is the pseudo-spectral method, relying on the FFT algorithm as numerical engine. The particle-based simulations presented in this paper quantitatively match the kinetic energy spectrum obtained with a pseudo-spectral method, using a trusted code. In terms of parallel performance, weak scaling results show the fmm-based vortex method achieving 74% parallel efficiency on 4096 processes (one gpu per mpi process, 3 gpus per node of the TSUBAME-2.0 system). The FFT-based spectral method is able to achieve just 14% parallel efficiency on the same number of mpi processes (using only cpu cores), due to the all-to-all communication pattern of the FFT algorithm. The calculation time for one time step was 108 seconds for the vortex method and 154 seconds for the spectral method, under these conditions. Computing with 69 billion particles, this work exceeds by an order of magnitude the largest vortex method calculations to date

    Parallel Graph Decompositions Using Random Shifts

    Full text link
    We show an improved parallel algorithm for decomposing an undirected unweighted graph into small diameter pieces with a small fraction of the edges in between. These decompositions form critical subroutines in a number of graph algorithms. Our algorithm builds upon the shifted shortest path approach introduced in [Blelloch, Gupta, Koutis, Miller, Peng, Tangwongsan, SPAA 2011]. By combining various stages of the previous algorithm, we obtain a significantly simpler algorithm with the same asymptotic guarantees as the best sequential algorithm
    • …
    corecore