Optimal, scalable forward models for computing gravity anomalies
We describe three approaches for computing a gravity signal from a density
anomaly. The first approach consists of the classical "summation" technique,
whilst the remaining two methods solve the Poisson problem for the
gravitational potential using either a Finite Element (FE) discretization
employing a multilevel preconditioner, or a Green's function evaluated with the
Fast Multipole Method (FMM). The methods utilizing the PDE formulation
described here differ from previously published approaches used in gravity
modeling in that they are optimal, implying that both the memory and
computational time required scale linearly with respect to the number of
unknowns in the potential field. Additionally, all of the implementations
presented here are developed such that the computations can be performed in a
massively parallel, distributed memory computing environment. Through numerical
experiments, we compare the methods on the basis of their discretization error,
CPU time and parallel scalability. We demonstrate the parallel scalability of
all these techniques by running forward models with up to voxels on
1000's of cores. Comment: 38 pages, 13 figures; accepted by Geophysical Journal International
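As an illustration of the classical "summation" technique mentioned in the abstract above, the following is a minimal sketch that assumes each voxel can be treated as a point mass at its centre. That approximation, the function name `gravity_down`, and the cell tuple layout are assumptions made here for illustration; the paper itself may use a different kernel (for example, exact prism formulas).

```python
import math

G = 6.67430e-11  # gravitational constant, m^3 kg^-1 s^-2


def gravity_down(obs, cells):
    """Downward gravity (m/s^2) at `obs` from point-mass cells.

    obs   : (x, y, z) in metres, z positive upward
    cells : iterable of (x, y, z, density_contrast, volume),
            with density contrast in kg/m^3 and volume in m^3
    (Hypothetical layout, chosen only for this sketch.)
    """
    ox, oy, oz = obs
    total = 0.0
    for cx, cy, cz, drho, vol in cells:
        dx, dy, dz = cx - ox, cy - oy, cz - oz
        r = math.sqrt(dx * dx + dy * dy + dz * dz)
        # Point-mass attraction projected onto the downward axis:
        # a cell below the observer (dz < 0) contributes positively.
        total += G * drho * vol * (-dz) / r**3
    return total


# One cell of 1e12 kg excess mass, 1000 m directly below the observer.
g = gravity_down((0.0, 0.0, 0.0), [(0.0, 0.0, -1000.0, 1000.0, 1.0e9)])
```

Each observation point visits every cell, so the cost grows as the number of observation points times the number of cells. This is the scaling that the PDE-based methods in the abstract are described as improving on.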
Petascale turbulence simulation using a highly parallel fast multipole method on GPUs
This paper reports large-scale direct numerical simulations of
homogeneous-isotropic fluid turbulence, achieving sustained performance of 1.08
petaflop/s on GPU hardware using single precision. The simulations use a vortex
particle method to solve the Navier-Stokes equations, with a highly parallel
fast multipole method (FMM) as numerical engine, and match the current record
in mesh size for this application, a cube of 4096^3 computational points solved
with a spectral method. The standard numerical approach used in this field is
the pseudo-spectral method, relying on the FFT algorithm as numerical engine.
The particle-based simulations presented in this paper quantitatively match the
kinetic energy spectrum obtained with a pseudo-spectral method, using a trusted
code. In terms of parallel performance, weak scaling results show the FMM-based
vortex method achieving 74% parallel efficiency on 4096 processes (one GPU per
MPI process, 3 GPUs per node of the TSUBAME-2.0 system). The FFT-based spectral
method is able to achieve just 14% parallel efficiency on the same number of
MPI processes (using only CPU cores), due to the all-to-all communication
pattern of the FFT algorithm. The calculation time for one time step was 108
seconds for the vortex method and 154 seconds for the spectral method, under
these conditions. Computing with 69 billion particles, this work exceeds by an
order of magnitude the largest vortex method calculations to date.
A parallel Heap-Cell Method for Eikonal equations
Numerous applications of Eikonal equations prompted the development of many
efficient numerical algorithms. The Heap-Cell Method (HCM) is a recent serial
two-scale technique that has been shown to have advantages over other serial
state-of-the-art solvers for a wide range of problems. This paper presents a
parallelization of HCM for a shared memory architecture. The numerical
experiments in show that the parallel HCM exhibits good algorithmic
behavior and scales well, resulting in a very fast and practical solver.
We further explore the influence on performance and scaling of data
precision, early termination criteria, and the hardware architecture. A shorter
version of this manuscript (omitting these more detailed tests) has been
submitted to SIAM Journal on Scientific Computing in 2012. Comment: (a minor update to address the reviewers' comments) 31 pages; 15
figures; this is an expanded version of a paper accepted by SIAM Journal on
Scientific Computing