7,006 research outputs found

    Achieving High Speed CFD simulations: Optimization, Parallelization, and FPGA Acceleration for the unstructured DLR TAU Code

    Get PDF
    Today, large scale parallel simulations are fundamental tools to handle complex problems. The number of processors in current computation platforms has been recently increased and therefore it is necessary to optimize the application performance and to enhance the scalability of massively-parallel systems. In addition, new heterogeneous architectures, combining conventional processors with specific hardware, like FPGAs, to accelerate the most time consuming functions are considered as a strong alternative to boost the performance. In this paper, the performance of the DLR TAU code is analyzed and optimized. The improvement of the code efficiency is addressed through three key activities: Optimization, parallelization and hardware acceleration. At first, a profiling analysis of the most time-consuming processes of the Reynolds Averaged Navier Stokes flow solver on a three-dimensional unstructured mesh is performed. Then, a study of the code scalability with new partitioning algorithms are tested to show the most suitable partitioning algorithms for the selected applications. Finally, a feasibility study on the application of FPGAs and GPUs for the hardware acceleration of CFD simulations is presented

    A GPU-enabled implicit Finite Volume solver for the ideal two-fluid plasma model on unstructured grids

    Full text link
    This paper describes the main features of a pioneering unsteady solver for simulating ideal two-fluid plasmas on unstructured grids, taking profit of GPGPU (General-purpose computing on graphics processing units). The code, which has been implemented within the open source COOLFluiD platform, is implicit, second-order in time and space, relying upon a Finite Volume method for the spatial discretization and a three-point backward Euler for the time integration. In particular, the convective fluxes are computed by a multi-fluid version of the AUSM+up scheme for the plasma equations, in combination with a modified Rusanov scheme with tunable dissipation for the Maxwell equations. Source terms are integrated with a one-point rule, using the cell-centered value. Some critical aspects of the porting to GPU's are discussed, as well as the performance of two open source linear system solvers (i.e. PETSc, PARALUTION). The code design allows for computing both flux and source terms on the GPU along with their Jacobian, giving a noticeable decrease in the computational time in comparison with the original CPU-based solver. The code has been tested in a wide range of mesh sizes and in three different systems, each one with a different GPU. The increased performance (up to 14x) is demonstrated in two representative 2D benchmarks: propagation of circularly polarized waves and the more challenging Geospace Environmental Modeling (GEM) magnetic reconnection challenge.Comment: 22 pages, 7 figure
    corecore