4 research outputs found

    Alternating direction implicit time integrations for finite difference acoustic wave propagation: Parallelization and convergence

    This work studies the parallelization and empirical convergence of two finite difference acoustic wave propagation methods on 2-D rectangular grids that use the same alternating direction implicit (ADI) time integration. This ADI integration is based on a second-order implicit Crank-Nicolson temporal discretization that is factored out by a Peaceman-Rachford decomposition of the time and space equation terms. In space, the two methods differ markedly and apply different fourth-order accurate differentiation techniques. The first method uses compact finite differences (CFD) on nodal meshes, which requires solving tridiagonal linear systems along each grid line, while the second employs staggered-grid mimetic finite differences (MFD). For each method, we implement three parallel versions: (i) a multithreaded code in Octave, (ii) a C++ code that exploits OpenMP loop parallelization, and (iii) a CUDA kernel for an NVIDIA GTX 960 Maxwell card. In these implementations, the main source of parallelism is the simultaneous ADI updating of each wave field matrix, either column-wise or row-wise, according to the differentiation direction. In our numerical applications, the highest performance is displayed by the CFD and MFD CUDA codes, which achieve speedups of 7.21x and 15.81x, respectively, relative to their sequential C++ counterparts compiled with optimal flags. Our test cases also allow us to assess the numerical convergence and accuracy of both methods. In a problem with an exact harmonic solution, both methods exhibit convergence rates close to 4, and the MFD accuracy is higher in practice. In contrast, both convergence rates decay to second order on smooth problems with severe gradients at boundaries, and the MFD rates degrade on highly resolved grids, leading to larger inaccuracies. This transition of the empirical convergence rates agrees with the nominal truncation errors in space and time. Comment: 20 pages, 5 figures
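
    The abstract above notes that the CFD method requires solving a tridiagonal linear system along each grid line. As a minimal sketch of what one such line solve can look like, the following uses the textbook Thomas algorithm. The function name `thomas_solve` and its argument layout are illustrative assumptions, not taken from the paper, and the paper's actual solver may differ.

```python
def thomas_solve(lower, diag, upper, rhs):
    """Solve a tridiagonal system with the Thomas algorithm (hypothetical helper).

    Row i reads: lower[i]*x[i-1] + diag[i]*x[i] + upper[i]*x[i+1] = rhs[i].
    lower[0] and upper[-1] are ignored.
    Assumption: no pivoting is needed (e.g. the matrix is diagonally dominant).
    """
    n = len(diag)
    c = [0.0] * n  # modified upper diagonal
    d = [0.0] * n  # modified right-hand side

    # Forward elimination sweep.
    c[0] = upper[0] / diag[0] if n > 1 else 0.0
    d[0] = rhs[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / denom if i < n - 1 else 0.0
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / denom

    # Back substitution sweep.
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x


# Example: a (1, 4, 1) tridiagonal matrix, chosen here only for illustration.
solution = thomas_solve([0, 1, 1, 1], [4, 4, 4, 4], [1, 1, 1, 0], [6, 12, 18, 19])
```

    Each grid line yields an independent system of this kind, which is consistent with the column-wise or row-wise parallelism the abstract describes.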

    A novel approach to evaluating compact finite differences and similar tridiagonal schemes on GPU-accelerated clusters

    Compact finite difference schemes are widely used in the direct numerical simulation of fluid flows for their ability to better resolve the small scales of turbulence. However, they can be expensive to evaluate and difficult to parallelize. In this work, we present an approach for the computation of compact finite differences and similar tridiagonal schemes on graphics processing units (GPUs). We present a variant of the cyclic reduction algorithm for solving the tridiagonal linear systems that arise in such numerical schemes. We study the impact of the matrix structure on the cyclic reduction algorithm and show that precomputing forward reduction coefficients can be especially effective for obtaining good performance. Our tridiagonal solver is able to outperform the NVIDIA CUSPARSE and the multithreaded Intel MKL tridiagonal solvers on GPU and CPU, respectively. In addition, we present a parallelization strategy for GPU-accelerated clusters, and show scalability of a 3-D compact finite difference application for up to 64 GPUs on Clemson’s Palmetto cluster.
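
    For readers unfamiliar with cyclic reduction, the following is a minimal serial sketch of the textbook algorithm, not the variant proposed in the paper. The function name `cyclic_reduction_solve` is hypothetical, and the sketch assumes the number of unknowns is of the form 2^k - 1 and that no pivoting is needed.

```python
def cyclic_reduction_solve(a, b, c, d):
    """Textbook cyclic reduction for a tridiagonal system (hypothetical helper).

    Row i reads: a[i]*x[i-1] + b[i]*x[i] + c[i]*x[i+1] = d[i].
    Assumptions: len(b) == 2**k - 1, and no pivoting is required.
    """
    n = len(b)
    if n == 0 or (n & (n + 1)) != 0:
        raise ValueError("this sketch requires n = 2**k - 1 unknowns")
    a, b, c, d = (list(map(float, v)) for v in (a, b, c, d))
    a[0] = 0.0   # no neighbour to the left of the first row
    c[-1] = 0.0  # no neighbour to the right of the last row

    # Forward reduction: at each stride, eliminate the two neighbours of
    # every second remaining row. Rows within one stride are independent,
    # which is what makes the method attractive on parallel hardware.
    s = 1
    while 2 * s <= n:
        for i in range(2 * s - 1, n, 2 * s):
            # alpha and beta depend only on the matrix, not on d, so they
            # could be computed once and reused for many right-hand sides.
            alpha = -a[i] / b[i - s]
            beta = -c[i] / b[i + s]
            b[i] += alpha * c[i - s] + beta * a[i + s]
            d[i] += alpha * d[i - s] + beta * d[i + s]
            a[i] = alpha * a[i - s]
            c[i] = beta * c[i + s]
        s *= 2

    # Back substitution: solve the middle row first, then fill in the
    # rows eliminated at each finer stride.
    x = [0.0] * n
    s = (n + 1) // 2
    while s >= 1:
        for i in range(s - 1, n, 2 * s):
            left = x[i - s] if i - s >= 0 else 0.0
            right = x[i + s] if i + s < n else 0.0
            x[i] = (d[i] - a[i] * left - c[i] * right) / b[i]
        s //= 2
    return x


# Example: a 7x7 (1, 4, 1) system, chosen here only for illustration.
x7 = cyclic_reduction_solve([0] + [1] * 6, [4] * 7, [1] * 6 + [0],
                            [6, 12, 18, 24, 30, 36, 34])
```

    In this sketch the multipliers `alpha` and `beta` are recomputed on every call. Because they depend only on the matrix entries, storing them is one plausible reading of the precomputed forward reduction coefficients mentioned in the abstract, though the paper's exact scheme is not given here.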