3,159 research outputs found
Development of Fast Algorithms Using Recursion, Nesting and Iterations for Computational Electromagnetics
In the first phase of our work, we concentrated on laying the foundation for developing fast algorithms, including the use of recursive structures such as the recursive aggregate interaction matrix algorithm (RAIMA), the nested equivalence principle algorithm (NEPAL), the ray-propagation fast multipole algorithm (RPFMA), and the multi-level fast multipole algorithm (MLFMA). We also investigated the use of curvilinear patches to build a basic method of moments code in which these acceleration techniques can be used later. In the second phase, which is mainly reported on here, we concentrated on implementing three-dimensional NEPAL on a massively parallel machine, the Connection Machine CM-5, and were able to obtain some 3D scattering results. In order to understand the parallelization of codes on the Connection Machine, we also studied the parallelization of a 3D finite-difference time-domain (FDTD) code with a PML material absorbing boundary condition (ABC). We found that simple algorithms like the FDTD with material ABC can be parallelized very well, allowing us to solve a problem of over a million nodes within a minute. In addition, we studied the use of the fast multipole method and the ray-propagation fast multipole algorithm to expedite matrix-vector multiplication in a conjugate-gradient solution to integral equations of scattering. We find that these methods are faster than LU decomposition for one incident angle, but slower than LU decomposition when many incident angles are needed, as in monostatic RCS calculations.
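The trade-off in the last sentence of the abstract above can be illustrated with a minimal conjugate-gradient sketch. It is not the authors' code. It assumes a small real, symmetric positive-definite matrix (scattering integral equations are generally complex and need a suitable variant), and the names `conjugate_gradient` and `dense_matvec` are hypothetical. The solver only ever touches the matrix through `matvec`, which is the place where a fast multipole product would replace the dense one. Each incident angle is a new right-hand side, so the iteration has to be run again from scratch, whereas LU factors can be reused.

```python
def dot(u, v):
    """Plain dot product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def dense_matvec(A):
    """Return a function computing A @ p by direct summation.

    Stand-in for a fast (e.g. multipole-accelerated) product: the solver
    below never sees the matrix itself, only this callback.
    """
    return lambda p: [dot(row, p) for row in A]


def conjugate_gradient(matvec, b, tol=1e-10, max_iter=1000):
    """Solve A x = b for symmetric positive-definite A, given only matvec.

    Returns the solution and the number of matrix-vector products used.
    """
    x = [0.0] * len(b)
    r = list(b)              # residual b - A x for the zero initial guess
    p = list(r)              # first search direction
    rs_old = dot(r, r)
    n_matvec = 0
    for _ in range(max_iter):
        if rs_old ** 0.5 < tol:
            break
        Ap = matvec(p)
        n_matvec += 1
        alpha = rs_old / dot(p, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        rs_new = dot(r, r)
        beta = rs_new / rs_old
        p = [ri + beta * pi for ri, pi in zip(r, p)]
        rs_old = rs_new
    return x, n_matvec


if __name__ == "__main__":
    A = [[4.0, 1.0], [1.0, 3.0]]
    matvec = dense_matvec(A)
    # Two right-hand sides play the role of two incident angles:
    # the whole iteration is repeated for each one.
    for rhs in ([1.0, 2.0], [2.0, 1.0]):
        solution, count = conjugate_gradient(matvec, rhs)
        print(rhs, solution, count)
```

With a fast product the cost per right-hand side is the iteration count times the cost of one `matvec`, which is why the iterative route wins for a single excitation and loses ground as the number of excitations grows.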
A pilgrimage to gravity on GPUs
In this short review we present the developments over the last 5 decades that
have led to the use of Graphics Processing Units (GPUs) for astrophysical
simulations. Since the introduction of NVIDIA's Compute Unified Device
Architecture (CUDA) in 2007 the GPU has become a valuable tool for N-body
simulations and is now so popular that almost all papers about
high-precision N-body simulations use methods that are accelerated by GPUs. With the
GPU hardware becoming more advanced and being used for more advanced algorithms
like gravitational tree-codes we see a bright future for GPU-like hardware in
computational astrophysics.
Comment: To appear in European Physical Journal "Special Topics": "Computer
Simulations on Graphics Processing Units". 18 pages, 8 figures
A fast recursive coordinate bisection tree for neighbour search and gravity
We introduce our new binary tree code for neighbour search and gravitational
force calculations in an N-particle system. The tree is built in a "top-down"
fashion by "recursive coordinate bisection" where on each tree level we split
the longest side of a cell through its centre of mass. This procedure continues
until the average number of particles in the lowest tree level has dropped
below a prescribed value. To calculate the forces on the particles in each
lowest-level cell we split the gravitational interaction into a near- and a
far-field. Since our main intended applications are SPH simulations, we
calculate the near-field by a direct, kernel-smoothed summation, while the far
field is evaluated via a Cartesian Taylor expansion up to quadrupole order.
Instead of applying the far-field approach for each particle separately, we use
another Taylor expansion around the centre of mass of each lowest-level cell to
determine the forces at the particle positions. Due to this "cell-cell
interaction" the code performance is close to O(N) where N is the number of
used particles. We describe in detail various technicalities that ensure a low
memory footprint and an efficient cache use.
In a set of benchmark tests we scrutinize our new tree and compare it to the
"Press tree" that we have previously made ample use of. At a slightly higher
force accuracy than the Press tree, our tree turns out to be substantially
faster and increasingly more so for larger particle numbers. For four million
particles our tree build is faster by a factor of 25 and the time for neighbour
search and gravity is reduced by more than a factor of 6. In single processor
tests with up to 10^8 particles we confirm experimentally that the scaling
behaviour is close to O(N). The current Fortran 90 code version is
OpenMP-parallel and scales excellently with the processor number (=24) of our
test machine.
Comment: 12 pages, 16 figures, 1 table, accepted for publication in MNRAS on
July 28, 201
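The top-down build described in the abstract above can be sketched in a few lines. This is only an illustration under stated assumptions, not the authors' Fortran 90 implementation: each cell is taken to be the tight bounding box of its particles, a cell stops splitting once it holds at most `leaf_size` particles (the abstract instead uses the average occupancy of the lowest level as its stopping criterion), and the names `build_rcb_tree` and `leaves` are hypothetical. The far-field Taylor expansions and the cell-cell interaction are not shown.

```python
def build_rcb_tree(points, masses, indices=None, leaf_size=8):
    """Recursive coordinate bisection: split the longest side of each
    cell through the centre of mass of the particles it contains."""
    if indices is None:
        indices = list(range(len(points)))
    dim = len(points[0])

    # Cell extent (assumption: tight bounding box of the particles).
    lo = [min(points[i][d] for i in indices) for d in range(dim)]
    hi = [max(points[i][d] for i in indices) for d in range(dim)]

    # Total mass and centre of mass, stored for later far-field use.
    total_mass = sum(masses[i] for i in indices)
    com = [sum(masses[i] * points[i][d] for i in indices) / total_mass
           for d in range(dim)]

    node = {"lo": lo, "hi": hi, "mass": total_mass, "com": com,
            "indices": indices, "children": None}
    if len(indices) <= leaf_size:
        return node

    # Longest side of the cell, cut at the centre-of-mass coordinate.
    axis = max(range(dim), key=lambda d: hi[d] - lo[d])
    left = [i for i in indices if points[i][axis] < com[axis]]
    right = [i for i in indices if points[i][axis] >= com[axis]]
    if not left or not right:
        # Degenerate cell (e.g. coincident particles): keep it as a leaf.
        return node

    node["indices"] = None  # only leaves keep particle lists
    node["children"] = (
        build_rcb_tree(points, masses, left, leaf_size),
        build_rcb_tree(points, masses, right, leaf_size),
    )
    return node


def leaves(node):
    """Yield the lowest-level cells of the tree."""
    if node["children"] is None:
        yield node
    else:
        for child in node["children"]:
            yield from leaves(child)


if __name__ == "__main__":
    import random
    random.seed(1)
    pts = [(random.random(), random.random(), random.random())
           for _ in range(1000)]
    tree = build_rcb_tree(pts, [1.0] * len(pts), leaf_size=12)
    print(sum(1 for _ in leaves(tree)), "leaf cells")
```

Storing the mass and centre of mass in every node during the build is what would let a far-field evaluation work with whole cells instead of individual particles.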
A hybrid MLFMM-UTD method for the solution of very large 2-D electromagnetic problems
The multilevel fast multipole method (MLFMM) is combined with the uniform theory of diffraction (UTD) to model two-dimensional (2-D) scattering problems that include very large scatterers. Discretization of the very large scatterers is avoided by using ray-based methods. Reflections are accounted for by image source theory, while for diffraction a new MLFMM translation matrix is introduced. The translation matrix elements are derived using a technique that generalizes the use of UTD to arbitrary source configurations and that efficiently describes the field over extended regions of space. O(n) scaling of the computational time and memory requirements is achieved for relevant structures, such as large antenna arrays in the presence of a wedge. The theory is validated by means of several illustrative numerical examples and is shown to remain accurate for non-line-of-sight (NLoS) scattering problems.