497 research outputs found
A Flexible Patch-Based Lattice Boltzmann Parallelization Approach for Heterogeneous GPU-CPU Clusters
Sustaining a large fraction of single GPU performance in parallel
computations is considered to be the major problem of GPU-based clusters. In
this article, this topic is addressed in the context of a lattice Boltzmann
flow solver that is integrated in the WaLBerla software framework. We propose a
multi-GPU implementation using a block-structured MPI parallelization, suitable
for load balancing and heterogeneous computations on CPUs and GPUs. The
overhead required for multi-GPU simulations is discussed in detail and it is
demonstrated that the kernel performance can be sustained to a large extent.
With our GPU implementation, we achieve nearly perfect weak scalability on
InfiniBand clusters. However, in strong scaling scenarios, multi-GPU setups make less
efficient use of the hardware than IBM BG/P and x86 clusters. Hence, a cost
analysis must determine the best course of action for a particular simulation
task. Additionally, weak scaling results of heterogeneous simulations conducted
on CPUs and GPUs simultaneously are presented using clusters equipped with
varying node configurations. Comment: 20 pages, 12 figures
Lattice Boltzmann modeling for shallow water equations using high performance computing
The aim of this dissertation project is to extend the standard lattice Boltzmann method (LBM) for shallow water flows in order to deal with three-dimensional flow fields. The shallow water and mass transport equations have wide applications in ocean, coastal, and hydraulic engineering, which can benefit from the advantages of the LBM. The LBM has recently become an attractive numerical method for solving various fluid dynamics problems; however, it has not been extensively applied to modeling shallow water flow and mass transport. Only a few works can be found on improving the LBM for mass transport in shallow water flows, and even fewer on extending it to model three-dimensional shallow water flow fields. The application of the LBM to the shallow water and mass transport equations has been limited because it is not clearly understood how the LBM solves these equations. The project first focuses on studying the importance of choosing enhanced collision operators such as the multiple-relaxation-time (MRT) and two-relaxation-time (TRT) operators over the standard single-relaxation-time (SRT) operator in the LBM. An MRT collision operator is chosen for the shallow water equations, while a TRT method is used for the advection-dispersion equation. Furthermore, two speed-of-sound techniques are introduced to account for heterogeneous and anisotropic dispersion coefficients. By selecting appropriate equilibrium distribution functions, the standard LBM is extended to solve three-dimensional wind-driven and density-driven circulation by introducing a multi-layer LB model. An MRT-LBM model is used to solve for each layer, with the layers coupled by the vertical viscosity forcing term. To increase solution stability, an implicit step is suggested to obtain stratified flow velocities. Numerical examples are presented to verify the multi-layer LB model against analytical solutions.
The model’s capability of calculating lateral and vertical distributions of the horizontal velocities is demonstrated for wind- and density-driven circulation over non-uniform bathymetry. The parallel performance of the LBM on central processing unit (CPU) based and graphics processing unit (GPU) based high performance computing (HPC) architectures is investigated, showing attractive performance in terms of speedup and scalability.
Reducing memory requirements for large size LBM simulations on GPUs
The scientific community, in its never-ending pursuit of larger and more efficient computational resources, needs more efficient implementations that can adapt to current parallel platforms. Graphics processing units are an appropriate platform that covers some of these demands. This architecture offers high performance at a reduced cost and with efficient power consumption. However, the memory capacity of these devices is limited, so expensive memory transfers are necessary to deal with big problems. Today, the lattice Boltzmann method (LBM) has positioned itself as an efficient approach for Computational Fluid Dynamics simulations. Although this method is particularly amenable to efficient parallelization, it needs a considerable memory capacity, which causes a dramatic fall in performance when dealing with large simulations. In this work, we propose some initiatives to minimize this memory demand, which allows us to execute bigger simulations on the same platform without additional memory transfers while keeping a high performance. In particular, we present two new implementations, LBM-Ghost and LBM-Swap, which are analyzed in depth, presenting the pros and cons of each. This project was funded by the Spanish Ministry of Economy and Competitiveness (MINECO): BCAM Severo Ochoa accreditation SEV-2013-0323, MTM2013-40824, Computación de Altas Prestaciones VII TIN2015-65316-P, by the Basque Excellence Research Center (BERC 2014-2017) program by the Basque Government, and by the Departament d'Innovació, Universitats i Empresa de la Generalitat de Catalunya, under project MPEXPAR: Models de Programació i Entorns d'Execució Paral·lels (2014-SGR-1051). We also thank the support of the computing facilities of Extremadura Research Centre for Advanced Technologies (CETA-CIEMAT) and NVIDIA GPU Research Center program for the provided resources, as well as the support of NVIDIA through the BSC/UPC NVIDIA GPU Center of Excellence. Peer reviewed. Postprint (author's final draft).
Wall Orientation and Shear Stress in the Lattice Boltzmann Model
The wall shear stress is a quantity of profound importance for clinical
diagnosis of artery diseases. The lattice Boltzmann method is an easily parallelizable
numerical method for solving flow problems, but it suffers from errors in
the velocity field near the boundaries, which lead to errors in the wall shear
stress and normal vectors computed from the velocity. In this work we present a
simple formula to calculate the wall shear stress in the lattice Boltzmann
model and propose to compute wall normals, which are necessary to compute the
wall shear stress, by taking the weighted mean over boundary facets lying in a
vicinity of a wall element. We carry out several tests and observe an increase
of accuracy of computed normal vectors over other methods in two and three
dimensions. Using the scheme we compute the wall shear stress in an inclined
and bent channel fluid flow and show a minor influence of the normal on the
numerical error, implying that the main error arises due to a corrupted
velocity field near the staircase boundary. Finally, we calculate the wall
shear stress in the human abdominal aorta in steady conditions using our method
and compare the results with a standard finite volume solver and experimental
data available in the literature. Applications of our ideas in a simplified
protocol for data preprocessing in medical applications are discussed. Comment: 9 pages, 11 figures