    A Multi-Scale Electromagnetic Particle Code with Adaptive Mesh Refinement and Its Parallelization

    Abstract: Space plasma phenomena occur as multi-scale processes, from the electron scale to the magnetohydrodynamic scale. To investigate such multi-scale phenomena, including plasma kinetic effects, we have begun developing a new electromagnetic Particle-In-Cell (PIC) code with the Adaptive Mesh Refinement (AMR) technique. AMR achieves high-resolution calculation while saving computer resources by dynamically generating and removing hierarchical cells. For parallelization, we adopt a domain decomposition method and use the Morton-ordered space-filling curve for good locality preservation and dynamic load balancing. In the PIC method, particle calculation occupies most of the total computation time, and in our AMR-PIC code the time step intervals are refined as well. To achieve load balancing between processes in the domain decomposition scheme, it is essential to account for the number of particle calculation loops of each cell, across all hierarchical levels, as the work weight of that cell. We therefore compute the work weights from the cost of particle calculation and the hierarchical level of each cell, and then decompose the domain along the Morton curve according to these weights, so that each processor receives approximately the same amount of work. A simple one-dimensional simulation confirms that dynamic load balancing is achieved and that the dynamic domain decomposition scheme reduces the computation time.
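The weight-based Morton-curve decomposition described above can be sketched in a few lines. The cost model (particles per cell times 2^level substeps, reflecting the refined time stepping) and all names are illustrative assumptions, not the authors' actual code:

```python
def morton_key_2d(x, y, bits=16):
    """Interleave the bits of integer cell coordinates (x, y)
    to obtain a Morton (Z-order) key."""
    key = 0
    for i in range(bits):
        key |= ((x >> i) & 1) << (2 * i)
        key |= ((y >> i) & 1) << (2 * i + 1)
    return key

def decompose(cells, n_procs):
    """Assign Morton-ordered cells to processors so that each gets
    roughly the same total work weight.

    `cells` is a list of dicts with integer coordinates, a particle
    count, and a refinement level.  With refined time stepping, a
    cell at level L is advanced 2**L times per coarse step, so its
    assumed weight is n_particles * 2**level.
    """
    ordered = sorted(cells, key=lambda c: morton_key_2d(c["x"], c["y"]))
    weights = [c["n_particles"] * 2 ** c["level"] for c in ordered]
    total = sum(weights)
    parts, acc, cut = [[] for _ in range(n_procs)], 0.0, 0
    for cell, w in zip(ordered, weights):
        # Move on to the next processor once its fair share is reached.
        if acc >= total * (cut + 1) / n_procs and cut < n_procs - 1:
            cut += 1
        parts[cut].append(cell)
        acc += w
    return parts
```

Because the cells are cut into contiguous runs along the Morton curve, each processor's subdomain stays spatially compact, which is what gives the scheme its locality.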

    Evaluation of an efficient stack-RLE clustering concept for dynamically adaptive grids

    This is the author accepted manuscript. The final version is available from the Society for Industrial and Applied Mathematics via the DOI in this record. Abstract: One approach to tackling the challenge of efficient implementations of parallel PDE simulations on dynamically changing grids is the use of space-filling curves (SFCs). While SFC algorithms possess advantageous properties such as low memory requirements and close-to-optimal partitioning with linear complexity, they require efficient communication strategies for keeping and utilizing the connectivity information, in particular for dynamically changing grids. Our approach is to use a sparse communication graph to store the connectivity information and to transfer data block-wise. This permits efficient generation of multiple partitions per memory context (denoted as clustering), which, in combination with run-length encoding (RLE), leads directly to elegant solutions for shared-, distributed-, and hybrid-memory parallelization and allows cluster-based optimizations. While previous work focused on specific aspects, this paper presents a compact overall summary of the stack-RLE clustering approach, completed by aspects of the vertex-based communication that ease understanding of the approach.
The central contribution of this work is a proof of the suitability of the stack-RLE clustering approach for efficiently realizing several relevant building blocks of Scientific Computing methodology and real-life CSE applications. We show 95% strong scalability in small-scale benchmarks on 512 cores and over 90% weak scalability on 8192 cores for finite-volume solvers with the grid structure changing in every time step; optimizations of simulation data backends through writer tasks; comparisons against analytical benchmarks to analyze the adaptivity criteria; and a tsunami simulation as a representative real-world wave-propagation showcase, in which our approach reduces the overall workload by 95% through parallel fully-adaptive mesh refinement and, compared with SFC-ordered regular grid cells, reduces the computation time by a factor of 7.6 with improved results and by a factor of 62.2 with results of similar accuracy at buoy stations. This work was partly supported by the German Research Foundation (DFG) as part of the Transregional Collaborative Research Centre “Invasive Computing” (SFB/TR 89).
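The run-length-encoded communication metadata at the heart of the stack-RLE clustering can be illustrated with a toy encoder: a cluster's boundary edges, listed in SFC order, tend to reference the same neighbouring partition for long runs, so storing (neighbour, run length) pairs compresses the adjacency information. This is only a schematic sketch under that assumption; names and data layout are hypothetical:

```python
def rle_edges(neighbor_ranks):
    """Compress a list of per-edge neighbor ranks (in SFC order)
    into (rank, run_length) pairs."""
    runs = []
    for rank in neighbor_ranks:
        if runs and runs[-1][0] == rank:
            # Extend the current run for a repeated neighbor.
            runs[-1] = (rank, runs[-1][1] + 1)
        else:
            # Start a new run when the neighbor changes.
            runs.append((rank, 1))
    return runs
```

A cluster can then exchange whole runs block-wise with each neighbour instead of handling edges one by one.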

    The Peano software---parallel, automaton-based, dynamically adaptive grid traversals

    We discuss the design decisions, design alternatives, and rationale behind the third generation of Peano, a framework for dynamically adaptive Cartesian meshes derived from spacetrees. Peano ties the mesh traversal to the mesh storage and supports only one element-wise traversal order, resulting from space-filling curves; the user is not free to choose a traversal order herself. The traversal can exploit regular grid subregions and shared memory as well as distributed memory systems with almost no modifications to a serial application code. We formalize the software design by means of two interacting automata—one automaton for the multiscale grid traversal and one for the application-specific algorithmic steps. This yields a callback-based programming paradigm. We further sketch the supported application types and the two data storage schemes realized, before we detail high-performance computing aspects and lessons learned. Special emphasis is put on observations regarding the programming idioms and algorithmic concepts used. This transforms our report from a “one way to implement things” code description into a generic discussion and summary of alternatives, rationale, and design decisions to be made for any tree-based adaptive mesh refinement software.
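The two-automata, callback-based paradigm described above can be caricatured in a few lines: a traversal automaton walks the spacetree in its one fixed order and hands control to application-defined callbacks at each event. The class and method names here are hypothetical illustrations, not Peano's actual API:

```python
class EventRecorder:
    """Application-side automaton: reacts to traversal events via callbacks."""
    def __init__(self):
        self.events = []

    def enter_cell(self, cell_id):
        self.events.append(("enter", cell_id))

    def leave_cell(self, cell_id):
        self.events.append(("leave", cell_id))

def traverse(tree, app):
    """Grid-side automaton: depth-first spacetree walk in a fixed order,
    issuing enter/leave callbacks to the application code."""
    app.enter_cell(tree["id"])
    for child in tree.get("children", []):
        traverse(child, app)
    app.leave_cell(tree["id"])
```

The application never iterates over the mesh itself; it only reacts to events, which is what lets the framework own storage, ordering, and parallelization.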

    The Magnetohydrodynamic-Particle-In-Cell Module in Athena++: Implementation and Code Tests

    We present a new magnetohydrodynamic-particle-in-cell (MHD-PIC) code integrated into the Athena++ framework. It treats energetic particles as in conventional PIC codes while the thermal plasma is treated as a background fluid described by MHD, thus primarily targeting multi-scale astrophysical problems involving the kinetic physics of cosmic rays (CRs). The code is optimized toward efficient vectorization in interpolation and particle deposits, with excellent parallel scaling. The code is also compatible with static/adaptive mesh refinement, with dynamic load balancing to further enhance multi-scale simulations. In addition, we have implemented a compressing/expanding box framework which allows adiabatic driving of CR pressure anisotropy, as well as the δf method, which can dramatically reduce Poisson noise in problems where the distribution function f is only expected to deviate slightly from the background. The code performance is demonstrated over a series of benchmark test problems, including particle acceleration in non-relativistic parallel shocks. In particular, we reproduce the linear growth of the CR gyro-resonant (streaming and pressure anisotropy) instabilities, under both the periodic and expanding/compressing box settings. We anticipate the code will open up the avenue for a wide range of astrophysical and plasma physics applications. Comment: 20 pages, 19 figures, submitted to MNRAS.
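The δf idea mentioned above can be shown in a toy form: markers are sampled from a distribution g, and each carries a weight w = (f - f0)/g, so deposits accumulate only the deviation from the known background f0; when f stays close to f0, the weights, and hence the Poisson noise, remain small. This is a generic δf sketch under those assumptions, not the Athena++ implementation:

```python
import math

def maxwellian(v, vth=1.0):
    """1-D Maxwellian background distribution f0(v)."""
    return math.exp(-0.5 * (v / vth) ** 2) / (vth * math.sqrt(2.0 * math.pi))

def delta_f_weights(marker_velocities, f, f0, g):
    """Per-marker delta-f weights w_i = (f - f0)/g at each marker velocity.

    f  -- the full distribution the markers represent
    f0 -- the analytically known background
    g  -- the marker (sampling) distribution
    """
    return [(f(v) - f0(v)) / g(v) for v in marker_velocities]
```

In the unperturbed limit (f identical to f0) every weight vanishes exactly, so the deposited perturbation carries no sampling noise at all; a full-f deposit of the same markers would still fluctuate.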