Reducing memory requirements for large size LBM simulations on GPUs
In its never-ending pursuit of larger and more efficient computational resources, the scientific community needs implementations that adapt efficiently to current parallel platforms. Graphics processing units (GPUs) are an appropriate platform that meets some of these demands: the architecture offers high performance at reduced cost with efficient power consumption. However, memory capacity on these devices is limited, so expensive memory transfers are necessary to deal with big problems. Today, the lattice-Boltzmann method (LBM) has positioned itself as an efficient approach for Computational Fluid Dynamics simulations. Although this method is particularly amenable to efficient parallelization, it needs considerable memory capacity, which causes a dramatic fall in performance when dealing with large simulations. In this work, we propose some initiatives to minimize this memory demand, which allows us to execute bigger simulations on the same platform without additional memory transfers while keeping high performance. In particular, we present two new implementations, LBM-Ghost and LBM-Swap, which are analyzed in depth, presenting the pros and cons of each.
This project was funded by the Spanish Ministry of Economy and Competitiveness (MINECO): BCAM Severo Ochoa accreditation SEV-2013-0323, MTM2013-40824, Computación de Altas Prestaciones VII TIN2015-65316-P, by the Basque Excellence Research Center (BERC 2014-2017) program of the Basque Government, and by the Departament d'Innovació, Universitats i Empresa de la Generalitat de Catalunya, under project MPEXPAR: Models de Programació i Entorns d'Execució Paral·lels (2014-SGR-1051). We also acknowledge the support of the computing facilities of Extremadura Research Centre for Advanced Technologies (CETA-CIEMAT) and the NVIDIA GPU Research Center program for the provided resources, as well as the support of NVIDIA through the BSC/UPC NVIDIA GPU Center of Excellence.
Peer Reviewed. Postprint (author's final draft)
A Flexible Patch-Based Lattice Boltzmann Parallelization Approach for Heterogeneous GPU-CPU Clusters
Sustaining a large fraction of single GPU performance in parallel
computations is considered to be the major problem of GPU-based clusters. In
this article, this topic is addressed in the context of a lattice Boltzmann
flow solver that is integrated in the WaLBerla software framework. We propose a
multi-GPU implementation using a block-structured MPI parallelization, suitable
for load balancing and heterogeneous computations on CPUs and GPUs. The
overhead required for multi-GPU simulations is discussed in detail and it is
demonstrated that the kernel performance can be sustained to a large extent.
With our GPU implementation, we achieve nearly perfect weak scalability on
InfiniBand clusters. However, in strong scaling scenarios multi-GPUs make less
efficient use of the hardware than IBM BG/P and x86 clusters. Hence, a cost
analysis must determine the best course of action for a particular simulation
task. Additionally, weak scaling results of heterogeneous simulations conducted
on CPUs and GPUs simultaneously are presented using clusters equipped with
varying node configurations.
Comment: 20 pages, 12 figures
Large-scale lattice Boltzmann simulations of complex fluids: advances through the advent of computational grids
During the last two years the RealityGrid project has allowed us to be one of
the few scientific groups involved in the development of computational grids.
Since smoothly working production grids are not yet available, we have been
able to substantially influence the direction of software development and grid
deployment within the project. In this paper we review our results from large
scale three-dimensional lattice Boltzmann simulations performed over the last
two years. We describe how the proactive use of computational steering and
advanced job migration and visualization techniques enabled us to do our
scientific work more efficiently. The projects reported on in this paper are
studies of complex fluid flows under shear or in porous media, as well as
large-scale parameter searches, and studies of the self-organisation of liquid
cubic mesophases.
Movies are available at
http://www.ica1.uni-stuttgart.de/~jens/pub/05/05-PhilTransReview.html
Comment: 18 pages, 9 figures, 4 movies available, accepted for publication in
Phil. Trans. R. Soc. London Series
Real Time Wake Computations using Lattice Boltzmann Method on Many Integrated Core Processors
This paper puts forward an efficient Lattice Boltzmann method for use as a wake simulator suitable for
real-time environments. The method is limited to low speed incompressible flow but is very efficient and
can be used to compute flows “on the fly”. In particular, many-core machines allow for the method to be
used without the need for very expensive parallel clusters. Results are shown here for flows around
cylinders and simple ship shapes.