2 research outputs found

    An Efficient Load Balancing Technique for Parallel FMA in Message Passing Environment

    N-body simulation has been used extensively in the study of the dynamics of galactic systems, fluids, and biomolecules. It is known to be computationally bound because of the direct force calculation among the bodies in the system. The time complexity is O(N^2), where N is the number of bodies. The fast multipole algorithm, proposed by Greengard and Rokhlin, reduces the complexity to O(N). A tremendous amount of work has been devoted to parallelizing the fast multipole algorithm for uniformly distributed particles. However, the particles in many applications are distributed nonuniformly. This causes load imbalance among processors, which in turn increases the total computational cost. Existing partitioning techniques do not work well when applied to the parallel fast multipole algorithm in a message passing environment, because of the tight relationship among the translations of multipole and local expansions. In this paper, we propose a new partitioning technique called weighted subtrees and present it..
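
The O(N^2) cost mentioned in the abstract comes from evaluating every pair of bodies once. The sketch below is a minimal illustration of that direct summation, not code from the paper: the function name `direct_potentials`, the choice of gravitational potential with G = 1, and the pair counter are all assumptions made for the example.

```python
import math

def direct_potentials(positions, masses):
    """Potential at each body by direct pairwise summation (G = 1, assumed)."""
    n = len(positions)
    phi = [0.0] * n
    pair_evaluations = 0
    for i in range(n):
        # Visit each unordered pair (i, j) exactly once.
        for j in range(i + 1, n):
            r = math.dist(positions[i], positions[j])
            phi[i] -= masses[j] / r
            phi[j] -= masses[i] / r
            pair_evaluations += 1
    return phi, pair_evaluations

if __name__ == "__main__":
    pts = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 2.0, 0.0)]
    phi, pairs = direct_potentials(pts, [1.0, 1.0, 1.0])
    print(phi, pairs)
```

The pair counter always equals N(N-1)/2, which is the quadratic growth that the fast multipole algorithm avoids by replacing far-field pairs with multipole and local expansions.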

    Distribution independent parallel algorithms and software for hierarchical methods with applications to computational electromagnetics

    Octrees are tree data structures used to represent multidimensional points in space. They are widely used to support hierarchical methods for scientific applications such as the N-body problem, molecular dynamics, and smoothed particle hydrodynamics. The size of an octree is known to depend on the spatial distribution of points in the computational domain and is not just a function of the number of points. For this reason, the run time of an octree-based algorithm that depends on the size of the octree is unknown for arbitrary distributions. In this thesis, we present the design and implementation of parallel algorithms for the construction of compressed octrees and for the queries typically used by hierarchical methods. Our parallel algorithms and implementation strategies perform well irrespective of the spatial distribution of data, are communication efficient, and require no explicit load balancing. We also developed a software library that provides parallel tree construction and various queries on compressed octrees. The purpose of the library is to enable rapid development of applications and to let application developers use efficient parallel algorithms without needing detailed knowledge of the algorithms or having to implement them. To demonstrate the performance of our algorithms and to show the effectiveness of the library, we developed a complete end-to-end parallel electromagnetics code for computing the scattered electromagnetic fields from a Perfect Electrically Conducting surface. We used the functions provided by the software library to develop a Fast Multipole Method based solution to this problem. Experimental results show that our algorithms scale well and have bounded communication irrespective of the shape of the scatterer.
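
The claim that octree size depends on the point distribution, and that compression removes this dependence, can be illustrated with a small node count. This is a sequential sketch under stated assumptions, not the thesis's parallel algorithm or its library API: the function `count_nodes` is hypothetical, points are assumed to be distinct integer triples in [0, 2**depth)^3, empty cells are not counted, and "compressed" here means that any node with exactly one non-empty child is skipped.

```python
def count_nodes(points, depth, compressed=False):
    """Count non-empty nodes of an octree over integer points in [0, 2**depth)^3."""
    def recurse(pts, level, origin):
        if len(pts) <= 1 or level == 0:
            return 1  # leaf cell
        half = 1 << (level - 1)
        children = {}
        for p in pts:
            # One boolean per axis: is the point in the upper half of the cell?
            key = tuple((p[k] - origin[k]) >= half for k in range(3))
            children.setdefault(key, []).append(p)
        total = 0
        for key, sub in children.items():
            child_origin = tuple(origin[k] + half * key[k] for k in range(3))
            total += recurse(sub, level - 1, child_origin)
        if compressed and len(children) == 1:
            return total  # chain node with a single child: skipped
        return total + 1
    return recurse(list(points), depth, (0, 0, 0))

if __name__ == "__main__":
    close = [(0, 0, 0), (1, 0, 0)]     # two points in adjacent unit cells
    far = [(0, 0, 0), (1023, 0, 0)]    # two points at opposite ends
    for name, pts in (("close", close), ("far", far)):
        print(name, count_nodes(pts, 10), count_nodes(pts, 10, compressed=True))
```

With two nearby points the uncompressed tree carries a long chain of single-child nodes, while the compressed count stays at three nodes for either placement, which is why algorithms whose cost tracks the compressed tree size can be distribution independent.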