
    An Improved Multi-Stage Preconditioner on GPUs for Compositional Reservoir Simulation

    The compositional model is often used to describe multicomponent multiphase porous media flows in the petroleum industry. The fully implicit method, with its strong stability and weak constraints on time-step sizes, is commonly used in mainstream commercial reservoir simulators. In this paper, we develop an efficient multi-stage preconditioner for the fully implicit compositional flow simulation. The method employs an adaptive setup phase to improve the parallel efficiency on GPUs. Furthermore, a multi-color Gauss-Seidel algorithm based on the adjacency matrix is applied in the algebraic multigrid methods for the pressure part. Numerical results demonstrate that the proposed algorithm achieves good parallel speedup while yielding the same convergence behavior as the corresponding sequential version. Comment: 24 pages, 4 figures, and 8 tables. arXiv admin note: text overlap with arXiv:2201.0197

    An Evaluation and Comparison of GPU Hardware and Solver Libraries for Accelerating the OPM Flow Reservoir Simulator

    Realistic reservoir simulation is known to be prohibitively expensive in terms of computation time when the accuracy of the simulation or the model grid size is increased. One method to address this issue is to parallelize the computation by dividing the model into several partitions and using multiple CPUs to compute the result with techniques such as MPI and multi-threading. Alternatively, GPUs are also a good candidate to accelerate the computation due to their massively parallel architecture, which allows many floating-point operations per second to be performed. The iterative linear solver takes the most computational time, and the linear system is challenging to solve efficiently because of the dependencies that exist in the model between cells. In this work, we evaluate the OPM Flow simulator and compare several state-of-the-art GPU solver libraries as well as custom-developed solutions for a BiCGStab solver using an ILU0 preconditioner, and benchmark their performance against the default DUNE library implementation running on multiple CPU processors using MPI. The evaluated GPU software libraries include a manual linear solver in OpenCL and the integration of several third-party sparse linear algebra libraries, such as cuSparse, rocSparse, and amgcl. To perform our benchmarking, we use small, medium, and large use cases, starting with the public test case NORNE that includes approximately 50k active cells and ending with a large model that includes approximately 1 million active cells. We find that a GPU can accelerate a single dual-threaded MPI process up to 5.6 times, and that it is comparable to around 8 dual-threaded MPI processes

    A GPU-accelerated package for simulation of flow in nanoporous source rocks with many-body dissipative particle dynamics

    Mesoscopic simulations of hydrocarbon flow in source shales are challenging, in part due to the heterogeneous shale pores with sizes ranging from a few nanometers to a few micrometers. Additionally, the sub-continuum fluid-fluid and fluid-solid interactions in nano- to micro-scale shale pores, which are physically and chemically sophisticated, must be captured. To address those challenges, we present a GPU-accelerated package for simulation of flow in nano- to micro-pore networks with a many-body dissipative particle dynamics (mDPD) mesoscale model. Based on a fully distributed parallel paradigm, the code offloads all intensive workloads to GPUs. Other advancements, such as smart particle packing and no-slip boundary conditions in complex pore geometries, are also implemented for the construction and the simulation of realistic shale pores from 3D nanometer-resolution stack images. Our code is validated for accuracy and compared against the CPU counterpart for speedup. In our benchmark tests, the code delivers nearly perfect strong scaling and weak scaling (with up to 512 million particles) on up to 512 K20X GPUs on Oak Ridge National Laboratory's (ORNL) Titan supercomputer. Moreover, a single-GPU benchmark on ORNL's SummitDev and IBM's AC922 suggests that the host-to-device NVLink can boost performance over PCIe by a remarkable 40%. Lastly, we demonstrate, through a flow simulation in realistic shale pores, that the CPU counterpart requires 840 Power9 cores to rival the performance delivered by our package with four V100 GPUs on ORNL's Summit architecture. This simulation package enables quick-turnaround and high-throughput mesoscopic numerical simulations for investigating complex flow phenomena in nano- to micro-porous rocks with realistic pore geometries

    Parallel Numerical Solution of Two-Phase Flow in Porous Media On Non-Orthogonal Geometries: a Performance Study Using Different GPU Architectures

    A parallel numerical model for two-phase flow (water and oil) in porous media on non-orthogonal geometries is solved using different Graphics Processing Unit (GPU) architectures in order to compare the performance that each of them can reach. The mathematical model is based on the transformed mass conservation equations for the water and oil phases, which results in two coupled non-linear partial differential equations (PDEs). The Finite Volume Method (FVM) is used to discretize the set of PDEs that govern this problem, and the Newton-Raphson method is used to linearize and solve them simultaneously. Solving the system of linear equations is computationally expensive and requires a large amount of time as the number of unknowns increases. We take advantage of current GPU computing technology to construct massively parallel numerical algorithms for modeling multi-phase flow in porous media [1, 2]. The Jacobian is constructed directly on the GPU, which reduces the information that needs to be exchanged between the CPU (Central Processing Unit) and the GPU. Libraries that include Krylov methods are used and tested. The numerical results indicate a speedup of up to 12x over a single CPU when applying GPU parallelism with the different architectures tested in this study (Kepler, Pascal and Turing). Furthermore, this study also tries to identify which of these architectures is the best option according to our computing needs

    MASSIVELY PARALLEL OIL RESERVOIR SIMULATION FOR HISTORY MATCHING
