2,607 research outputs found
Solutions of large-scale electromagnetics problems involving dielectric objects with the parallel multilevel fast multipole algorithm
Fast and accurate solutions of large-scale electromagnetics problems involving homogeneous dielectric objects are considered. Problems are formulated with the electric and magnetic current combined-field integral equation and discretized with the Rao-Wilton-Glisson functions. Solutions are performed iteratively by using the multi-level fast multipole algorithm (MLFMA). For the solution of large-scale problems discretized with millions of unknowns, MLFMA is parallelized on distributed-memory architectures using a rigorous technique, namely, the hierarchical partitioning strategy. Efficiency and accuracy of the developed implementation are demonstrated on very large problems involving as many as 100 million unknowns
A Parallel Adaptive P3M code with Hierarchical Particle Reordering
We discuss the design and implementation of HYDRA_OMP a parallel
implementation of the Smoothed Particle Hydrodynamics-Adaptive P3M (SPH-AP3M)
code HYDRA. The code is designed primarily for conducting cosmological
hydrodynamic simulations and is written in Fortran77+OpenMP. A number of
optimizations for RISC processors and SMP-NUMA architectures have been
implemented, the most important optimization being hierarchical reordering of
particles within chaining cells, which greatly improves data locality thereby
removing the cache misses typically associated with linked lists. Parallel
scaling is good, with a minimum parallel scaling of 73% achieved on 32 nodes
for a variety of modern SMP architectures. We give performance data in terms of
the number of particle updates per second, which is a more useful performance
metric than raw MFlops. A basic version of the code will be made available to
the community in the near future.Comment: 34 pages, 12 figures, accepted for publication in Computer Physics
Communication
An adaptive hierarchical domain decomposition method for parallel contact dynamics simulations of granular materials
A fully parallel version of the contact dynamics (CD) method is presented in
this paper. For large enough systems, 100% efficiency has been demonstrated for
up to 256 processors using a hierarchical domain decomposition with dynamic
load balancing. The iterative scheme to calculate the contact forces is left
domain-wise sequential, with data exchange after each iteration step, which
ensures its stability. The number of additional iterations required for
convergence by the partially parallel updates at the domain boundaries becomes
negligible with increasing number of particles, which allows for an effective
parallelization. Compared to the sequential implementation, we found no
influence of the parallelization on simulation results.Comment: 19 pages, 15 figures, published in Journal of Computational Physics
(2011
- …