42 research outputs found
Multibody Multipole Methods
A three-body potential function can account for interactions among triples of
particles which are uncaptured by pairwise interaction functions such as
Coulombic or Lennard-Jones potentials. Likewise, a multibody potential of order
can account for interactions among -tuples of particles uncaptured by
interaction functions of lower orders. To date, the computation of multibody
potential functions for a large number of particles has not been possible due
to its scaling cost. In this paper we describe a fast tree-code for
efficiently approximating multibody potentials that can be factorized as
products of functions of pairwise distances. For the first time, we show how to
derive a Barnes-Hut type algorithm for handling interactions among more than
two particles. Our algorithm uses two approximation schemes: 1) a deterministic
series expansion-based method; 2) a Monte Carlo-based approximation based on
the central limit theorem. Our approach guarantees a user-specified bound on
the absolute or relative error in the computed potential with an asymptotic
probability guarantee. We provide speedup results on a three-body dispersion
potential, the Axilrod-Teller potential.Comment: To appear in Journal of Computational Physic
Fast Multipole Methods for Three-Dimensional N-body Problems
We are developing computational tools for the simulations of three-dimensional flows past bodies undergoing arbitrary motions. High resolution viscous vortex methods have been developed that allow for extended simulations of two-dimensional configurations such as vortex generators. Our objective is to extend this methodology to three dimensions and develop a robust computational scheme for the simulation of such flows. A fundamental issue in the use of vortex methods is the ability of employing efficiently large numbers of computational elements to resolve the large range of scales that exist in complex flows. The traditional cost of the method scales as Omicron (N(sup 2)) as the N computational elements/particles induce velocities at each other, making the method unacceptable for simulations involving more than a few tens of thousands of particles. In the last decade fast methods have been developed that have operation counts of Omicron (N log N) or Omicron (N) (referred to as BH and GR respectively) depending on the details of the algorithm. These methods are based on the observation that the effect of a cluster of particles at a certain distance may be approximated by a finite series expansion. In order to exploit this observation we need to decompose the element population spatially into clusters of particles and build a hierarchy of clusters (a tree data structure) - smaller neighboring clusters combine to form a cluster of the next size up in the hierarchy and so on. This hierarchy of clusters allows one to determine efficiently when the approximation is valid. This algorithm is an N-body solver that appears in many fields of engineering and science. Some examples of its diverse use are in astrophysics, molecular dynamics, micro-magnetics, boundary element simulations of electromagnetic problems, and computer animation. More recently these N-body solvers have been implemented and applied in simulations involving vortex methods. Koumoutsakos and Leonard (1995) implemented the GR scheme in two dimensions for vector computer architectures allowing for simulations of bluff body flows using millions of particles. Winckelmans presented three-dimensional, viscous simulations of interacting vortex rings, using vortons and an implementation of a BH scheme for parallel computer architectures. Bhatt presented a vortex filament method to perform inviscid vortex ring interactions, with an alternative implementation of a BH scheme for a Connection Machine parallel computer architecture
A new parallel kernel-independent fast multipole method
We present a new adaptive fast multipole algorithm and its parallel implementation. The algorithm is kernel-independent in the sense that the evaluation of pairwise interactions does not rely on any analytic expansions, but only utilizes kernel evaluations. The new method provides the enabling technology for many important problems in computational science and engineering. Examples include viscous flows, fracture mechanics and screened Coulombic interactions. Our MPI-based parallel implementation logically separates the computation and communication phases to avoid synchronization in the upward and downward computation passes, and thus allows us to fully exploit computation and communication overlapping. We measure isogranular and fixed-size scalability for a variety of kernels on the Pittsburgh Supercomputing Center\u27s TCS-1 Alphaserver on up to 3000 processors. We have solved viscous flow problems with up to 2.1 billion unknowns and we have achieved 1.6 Tflops/s peak performance and 1.13 Tflops/s sustained performance