497 research outputs found

    A Flexible Patch-Based Lattice Boltzmann Parallelization Approach for Heterogeneous GPU-CPU Clusters

    Sustaining a large fraction of single GPU performance in parallel computations is considered to be the major problem of GPU-based clusters. In this article, this topic is addressed in the context of a lattice Boltzmann flow solver that is integrated in the WaLBerla software framework. We propose a multi-GPU implementation using a block-structured MPI parallelization, suitable for load balancing and heterogeneous computations on CPUs and GPUs. The overhead required for multi-GPU simulations is discussed in detail and it is demonstrated that the kernel performance can be sustained to a large extent. With our GPU implementation, we achieve nearly perfect weak scalability on InfiniBand clusters. However, in strong scaling scenarios multi-GPUs make less efficient use of the hardware than IBM BG/P and x86 clusters. Hence, a cost analysis must determine the best course of action for a particular simulation task. Additionally, weak scaling results of heterogeneous simulations conducted on CPUs and GPUs simultaneously are presented using clusters equipped with varying node configurations.
    Comment: 20 pages, 12 figures

    Lattice Boltzmann modeling for shallow water equations using high performance computing

    The aim of this dissertation project is to extend the standard lattice Boltzmann method (LBM) for shallow water flows in order to deal with three-dimensional flow fields. The shallow water and mass transport equations have wide applications in ocean, coastal, and hydraulic engineering, which can benefit from the advantages of the LBM. The LBM has recently become an attractive numerical method for solving various fluid dynamics phenomena; however, it has not been extensively applied to modeling shallow water flow and mass transport. Only a few works can be found on improving the LBM for mass transport in shallow water flows, and even fewer on extending it to model three-dimensional shallow water flow fields. The application of the LBM to the shallow water and mass transport equations has been limited because it is not clearly understood how the LBM solves these equations. The project first focuses on studying the importance of choosing enhanced collision operators, such as the multiple-relaxation-time (MRT) and two-relaxation-time (TRT) operators, over the standard single-relaxation-time (SRT) operator in the LBM. An MRT collision operator is chosen for the shallow water equations, while a TRT method is used for the advection-dispersion equation. Furthermore, two speed-of-sound techniques are introduced to account for heterogeneous and anisotropic dispersion coefficients. By selecting appropriate equilibrium distribution functions, the standard LBM is extended to solve three-dimensional wind-driven and density-driven circulation by introducing a multi-layer LB model. An MRT-LBM model is used to solve for each layer, with the layers coupled by the vertical viscosity forcing term. To increase solution stability, an implicit step is suggested to obtain stratified flow velocities. Numerical examples are presented to verify the multi-layer LB model against analytical solutions. The model's capability of calculating lateral and vertical distributions of the horizontal velocities is demonstrated for wind- and density-driven circulation over non-uniform bathymetry. The parallel performance of the LBM on central processing unit (CPU) based and graphics processing unit (GPU) based high performance computing (HPC) architectures is investigated, showing attractive performance in relation to speedup and scalability.

    Reducing memory requirements for large size LBM simulations on GPUs

    The scientific community, in its never-ending pursuit of larger and more efficient computational resources, needs implementations that adapt efficiently to current parallel platforms. Graphics processing units (GPUs) are an appropriate platform that meets some of these demands: the architecture offers high performance at reduced cost with efficient power consumption. However, memory capacity on these devices is limited, so expensive memory transfers are necessary to handle large problems. The lattice-Boltzmann method (LBM) has become established as an efficient approach for computational fluid dynamics simulations. Although this method is particularly amenable to efficient parallelization, it requires considerable memory capacity, which causes a dramatic fall in performance when dealing with large simulations. In this work, we propose several strategies to reduce this memory demand, which allow us to execute larger simulations on the same platform without additional memory transfers while maintaining high performance. In particular, we present two new implementations, LBM-Ghost and LBM-Swap, and analyze the pros and cons of each in depth.
    This project was funded by the Spanish Ministry of Economy and Competitiveness (MINECO): BCAM Severo Ochoa accreditation SEV-2013-0323, MTM2013-40824, Computación de Altas Prestaciones VII TIN2015-65316-P; by the Basque Excellence Research Center (BERC 2014-2017) program of the Basque Government; and by the Departament d'Innovació, Universitats i Empresa de la Generalitat de Catalunya, under project MPEXPAR: Models de Programació i Entorns d'Execució Paral·lels (2014-SGR-1051). We also acknowledge the computing facilities of the Extremadura Research Centre for Advanced Technologies (CETA-CIEMAT) and the NVIDIA GPU Research Center program for the provided resources, as well as the support of NVIDIA through the BSC/UPC NVIDIA GPU Center of Excellence.
    Peer Reviewed. Postprint (author's final draft)

    Wall Orientation and Shear Stress in the Lattice Boltzmann Model

    The wall shear stress is a quantity of profound importance for the clinical diagnosis of artery diseases. The lattice Boltzmann method is an easily parallelizable numerical method for solving flow problems, but it suffers from errors in the velocity field near the boundaries, which lead to errors in the wall shear stress and in normal vectors computed from the velocity. In this work we present a simple formula to calculate the wall shear stress in the lattice Boltzmann model and propose to compute wall normals, which are necessary to compute the wall shear stress, by taking the weighted mean over boundary facets lying in the vicinity of a wall element. We carry out several tests and observe an increase in the accuracy of the computed normal vectors over other methods in two and three dimensions. Using the scheme, we compute the wall shear stress in an inclined and a bent channel flow and show that the normal has only a minor influence on the numerical error, implying that the main error arises from a corrupted velocity field near the staircase boundary. Finally, we calculate the wall shear stress in the human abdominal aorta in steady conditions using our method and compare the results with a standard finite volume solver and experimental data available in the literature. Applications of our ideas in a simplified protocol for data preprocessing in medical applications are discussed.
    Comment: 9 pages, 11 figures
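    The weighted mean over nearby boundary facets mentioned in this abstract can be illustrated with a minimal sketch. The abstract does not say which weights the authors use, so the inverse-distance weights here are an assumption, and the function name `weighted_mean_normal` and the `radius` parameter are hypothetical. The sketch only shows the general idea of smoothing staircase facet normals into a single wall normal.

```python
import numpy as np


def weighted_mean_normal(facet_normals, facet_centers, wall_point, radius):
    """Estimate a unit wall normal at wall_point from nearby boundary facets.

    Assumption: weights decay as 1 / (1 + distance); the abstract does not
    state the actual weighting, so this choice is purely illustrative.
    """
    normals = np.asarray(facet_normals, dtype=float)
    centers = np.asarray(facet_centers, dtype=float)
    point = np.asarray(wall_point, dtype=float)

    # Keep only facets whose centers lie within `radius` of the wall element.
    dist = np.linalg.norm(centers - point, axis=1)
    nearby = dist <= radius
    if not np.any(nearby):
        raise ValueError("no boundary facets within the given radius")

    # Weighted sum of the facet normals, then renormalize to unit length.
    weights = 1.0 / (1.0 + dist[nearby])
    mean = (weights[:, None] * normals[nearby]).sum(axis=0)
    length = np.linalg.norm(mean)
    if length == 0.0:
        raise ValueError("facet normals cancel out; normal is undefined")
    return mean / length


if __name__ == "__main__":
    # 2D staircase approximating a wall inclined at 45 degrees:
    # horizontal facets have normal (0, 1), vertical facets have normal (-1, 0).
    centers, normals = [], []
    for i in range(5):
        centers.append((i + 0.5, i))
        normals.append((0.0, 1.0))
        centers.append((i + 1.0, i + 0.5))
        normals.append((-1.0, 0.0))
    print(weighted_mean_normal(normals, centers, (2.5, 2.5), radius=10.0))
```

    For this symmetric staircase the horizontal and vertical facets contribute equally, so the estimate points along the diagonal even though no individual facet normal does.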