
    Development of Fast Algorithms Using Recursion, Nesting and Iterations for Computational Electromagnetics

    In the first phase of our work, we concentrated on laying the foundation for developing fast algorithms, including recursive structures such as the recursive aggregate interaction matrix algorithm (RAIMA), the nested equivalence principle algorithm (NEPAL), the ray-propagation fast multipole algorithm (RPFMA), and the multi-level fast multipole algorithm (MLFMA). We also investigated the use of curvilinear patches to build a basic method-of-moments code in which these acceleration techniques can be used later. In the second phase, which is the main subject of this report, we concentrated on implementing three-dimensional NEPAL on a massively parallel machine, the Connection Machine CM-5, and obtained some 3D scattering results. To understand the parallelization of codes on the Connection Machine, we also studied the parallelization of a 3D finite-difference time-domain (FDTD) code with a perfectly matched layer (PML) material absorbing boundary condition (ABC). We found that simple algorithms like the FDTD with material ABC parallelize very well, allowing us to solve a problem of over a million nodes within a minute. In addition, we studied the use of the fast multipole method and the ray-propagation fast multipole algorithm to expedite matrix-vector multiplication in a conjugate-gradient solution of the integral equations of scattering. We find that these methods are faster than LU decomposition for one incident angle but slower than LU decomposition when many incident angles are needed, as in monostatic RCS calculations.
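The trade-off in the last sentence of this abstract can be illustrated with a rough operation-count model. This is only a sketch under stated assumptions, not the authors' analysis: it assumes a dense LU factorization costs about (2/3)N^3 operations once plus about 2N^2 per incident angle, and that each conjugate-gradient iteration with a fast matrix-vector product costs about c·N·log2(N). The function names, the iteration count `iters`, and the constant `c` are hypothetical.

```python
import math
from typing import Optional


def lu_cost(n: int, n_rhs: int) -> float:
    """Assumed cost model: one factorization plus two triangular solves per right-hand side."""
    return (2.0 / 3.0) * n**3 + 2.0 * n**2 * n_rhs


def iterative_cost(n: int, n_rhs: int, iters: int, c: float) -> float:
    """Assumed cost model: every right-hand side needs its own full iterative solve."""
    return n_rhs * iters * c * n * math.log2(n)


def crossover_rhs(n: int, iters: int, c: float) -> Optional[int]:
    """Smallest number of right-hand sides at which LU is no more expensive
    than the iterative solver, or None if the iterative solver always wins."""
    per_rhs_iterative = iters * c * n * math.log2(n)
    per_rhs_lu = 2.0 * n**2
    if per_rhs_iterative <= per_rhs_lu:
        return None  # iterative is cheaper per solve and has no factorization cost
    factorization = (2.0 / 3.0) * n**3
    return math.ceil(factorization / (per_rhs_iterative - per_rhs_lu))
```

Under this model the iterative solver wins for a single incident angle whenever the factorization dominates, while LU amortizes its one-time factorization over many angles, which is the monostatic RCS situation the abstract describes.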

    A pilgrimage to gravity on GPUs

    In this short review we present the developments over the last five decades that have led to the use of Graphics Processing Units (GPUs) for astrophysical simulations. Since the introduction of NVIDIA's Compute Unified Device Architecture (CUDA) in 2007, the GPU has become a valuable tool for N-body simulations and is now so popular that almost all papers about high-precision N-body simulations use methods that are accelerated by GPUs. With GPU hardware becoming more advanced and being used for more advanced algorithms, such as gravitational tree-codes, we see a bright future for GPU-like hardware in computational astrophysics. Comment: To appear in: European Physical Journal "Special Topics": "Computer Simulations on Graphics Processing Units". 18 pages, 8 figures

    A fast recursive coordinate bisection tree for neighbour search and gravity

    We introduce our new binary tree code for neighbour search and gravitational force calculations in an N-particle system. The tree is built in a "top-down" fashion by "recursive coordinate bisection" where on each tree level we split the longest side of a cell through its centre of mass. This procedure continues until the average number of particles in the lowest tree level has dropped below a prescribed value. To calculate the forces on the particles in each lowest-level cell we split the gravitational interaction into a near- and a far-field. Since our main intended applications are SPH simulations, we calculate the near-field by a direct, kernel-smoothed summation, while the far-field is evaluated via a Cartesian Taylor expansion up to quadrupole order. Instead of applying the far-field approach for each particle separately, we use another Taylor expansion around the centre of mass of each lowest-level cell to determine the forces at the particle positions. Due to this "cell-cell interaction" the code performance is close to O(N), where N is the number of particles used. We describe in detail various technicalities that ensure a low memory footprint and efficient cache use. In a set of benchmark tests we scrutinize our new tree and compare it to the "Press tree" that we have previously made ample use of. At a slightly higher force accuracy than the Press tree, our tree turns out to be substantially faster, and increasingly so for larger particle numbers. For four million particles our tree build is faster by a factor of 25 and the time for neighbour search and gravity is reduced by more than a factor of 6. In single-processor tests with up to 10^8 particles we confirm experimentally that the scaling behaviour is close to O(N). The current Fortran 90 code version is OpenMP-parallel and scales excellently with the processor number (=24) of our test machine. Comment: 12 pages, 16 figures, 1 table, accepted for publication in MNRAS on July 28, 201
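The top-down build described in this abstract can be sketched in a few lines of Python. This is a minimal illustration, not the authors' Fortran 90 code. It makes two simplifying assumptions: a cell's "sides" are taken from the tight bounding box of its particles, and recursion stops per cell once it holds at most `max_leaf` particles, rather than by the level-wide average the abstract mentions. The names `Cell`, `build_rcb_tree`, `leaves`, and `max_leaf` are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence, Tuple

Point = Tuple[float, float, float]


@dataclass
class Cell:
    indices: List[int]              # particles contained in this cell
    left: Optional["Cell"] = None   # child below the split plane
    right: Optional["Cell"] = None  # child at or above the split plane


def build_rcb_tree(points: Sequence[Point], masses: Sequence[float],
                   indices: Sequence[int], max_leaf: int = 4) -> Cell:
    """Recursive coordinate bisection: split the longest side through the centre of mass."""
    cell = Cell(indices=list(indices))
    idx = cell.indices
    if len(idx) <= max_leaf:
        return cell  # assumed per-cell stopping rule (simplification)

    # Longest side of the tight bounding box (assumption: box is recomputed per cell).
    extent = [max(points[i][k] for i in idx) - min(points[i][k] for i in idx)
              for k in range(3)]
    axis = max(range(3), key=lambda k: extent[k])

    # Centre-of-mass coordinate along the chosen axis defines the split plane.
    total_mass = sum(masses[i] for i in idx)
    com = sum(masses[i] * points[i][axis] for i in idx) / total_mass

    left = [i for i in idx if points[i][axis] < com]
    right = [i for i in idx if points[i][axis] >= com]
    if not left or not right:
        return cell  # degenerate case, e.g. coincident particles

    cell.left = build_rcb_tree(points, masses, left, max_leaf)
    cell.right = build_rcb_tree(points, masses, right, max_leaf)
    return cell


def leaves(cell: Cell) -> List[Cell]:
    """Collect the lowest-level cells, where near- and far-field forces would be evaluated."""
    if cell.left is None or cell.right is None:
        return [cell]
    return leaves(cell.left) + leaves(cell.right)
```

Calling `build_rcb_tree(points, masses, range(len(points)))` returns the root cell, and `leaves(root)` lists the lowest-level cells. The force evaluation (kernel-smoothed near-field sum and quadrupole-order far-field expansion) is deliberately left out of this sketch.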

    A hybrid MLFMM-UTD method for the solution of very large 2-D electromagnetic problems

    The multilevel fast multipole method (MLFMM) is combined with the uniform theory of diffraction (UTD) to model two-dimensional (2-D) scattering problems that include very large scatterers. Discretization of the very large scatterers is avoided by using ray-based methods. Reflections are accounted for by image source theory, while for diffraction a new MLFMM translation matrix is introduced. The translation matrix elements are derived using a technique that generalizes the use of UTD to arbitrary source configurations and that efficiently describes the field over extended regions of space. O(n) scaling of the computational time and memory requirements is achieved for relevant structures, such as large antenna arrays in the presence of a wedge. The theory is validated by means of several illustrative numerical examples and is shown to remain accurate for non-line-of-sight (NLoS) scattering problems.