7,392 research outputs found

    Reducing memory requirements for large size LBM simulations on GPUs

    Get PDF
    The scientific community in its never-ending road of larger and more efficient computational resources is in need of more efficient implementations that can adapt efficiently on the current parallel platforms. Graphics processing units are an appropriate platform that cover some of these demands. This architecture presents a high performance with a reduced cost and an efficient power consumption. However, the memory capacity in these devices is reduced and so expensive memory transfers are necessary to deal with big problems. Today, the lattice-Boltzmann method (LBM) has positioned as an efficient approach for Computational Fluid Dynamics simulations. Despite this method is particularly amenable to be efficiently parallelized, it is in need of a considerable memory capacity, which is the consequence of a dramatic fall in performance when dealing with large simulations. In this work, we propose some initiatives to minimize such demand of memory, which allows us to execute bigger simulations on the same platform without additional memory transfers, keeping a high performance. In particular, we present 2 new implementations, LBM-Ghost and LBM-Swap, which are deeply analyzed, presenting the pros and cons of each of them.This project was funded by the Spanish Ministry of Economy and Competitiveness (MINECO): BCAM Severo Ochoa accreditation SEV-2013-0323, MTM2013-40824, Computación de Altas Prestaciones VII TIN2015-65316-P, by the Basque Excellence Research Center (BERC 2014-2017) pro- gram by the Basque Government, and by the Departament d' Innovació, Universitats i Empresa de la Generalitat de Catalunya, under project MPEXPAR: Models de Programació i Entorns d' Execució Paral·lels (2014-SGR-1051). We also thank the support of the computing facilities of Extremadura Research Centre for Advanced Technologies (CETA-CIEMAT) and NVIDIA GPU Research Center program for the provided resources, as well as the support of NVIDIA through the BSC/UPC NVIDIA GPU Center of Excellence.Peer ReviewedPostprint (author's final draft

    A Flexible Patch-Based Lattice Boltzmann Parallelization Approach for Heterogeneous GPU-CPU Clusters

    Full text link
    Sustaining a large fraction of single GPU performance in parallel computations is considered to be the major problem of GPU-based clusters. In this article, this topic is addressed in the context of a lattice Boltzmann flow solver that is integrated in the WaLBerla software framework. We propose a multi-GPU implementation using a block-structured MPI parallelization, suitable for load balancing and heterogeneous computations on CPUs and GPUs. The overhead required for multi-GPU simulations is discussed in detail and it is demonstrated that the kernel performance can be sustained to a large extent. With our GPU implementation, we achieve nearly perfect weak scalability on InfiniBand clusters. However, in strong scaling scenarios multi-GPUs make less efficient use of the hardware than IBM BG/P and x86 clusters. Hence, a cost analysis must determine the best course of action for a particular simulation task. Additionally, weak scaling results of heterogeneous simulations conducted on CPUs and GPUs simultaneously are presented using clusters equipped with varying node configurations.Comment: 20 pages, 12 figure

    Large-scale lattice Boltzmann simulations of complex fluids: advances through the advent of computational grids

    Get PDF
    During the last two years the RealityGrid project has allowed us to be one of the few scientific groups involved in the development of computational grids. Since smoothly working production grids are not yet available, we have been able to substantially influence the direction of software development and grid deployment within the project. In this paper we review our results from large scale three-dimensional lattice Boltzmann simulations performed over the last two years. We describe how the proactive use of computational steering and advanced job migration and visualization techniques enabled us to do our scientific work more efficiently. The projects reported on in this paper are studies of complex fluid flows under shear or in porous media, as well as large-scale parameter searches, and studies of the self-organisation of liquid cubic mesophases. Movies are available at http://www.ica1.uni-stuttgart.de/~jens/pub/05/05-PhilTransReview.htmlComment: 18 pages, 9 figures, 4 movies available, accepted for publication in Phil. Trans. R. Soc. London Series

    Real Time Wake Computations using Lattice Boltzmann Method on Many Integrated Core Processors

    Get PDF
    This paper puts forward an efficient Lattice Boltzmann method for use as a wake simulator suitable for real-time environments. The method is limited to low speed incompressible flow but is very efficient and can be used to compute flows “on the fly”. In particular, many-core machines allow for the method to be used with the need of very expensive parallel clusters. Results are shown here for flows around cylinders and simple ship shapes

    Real Time Wake Computations using Lattice Boltzmann Method on Many Integrated Core Processors

    Get PDF
    This paper puts forward an efficient Lattice Boltzmann method for use as a wake simulator suitable for real-time environments. The method is limited to low speed incompressible flow but is very efficient and can be used to compute flows “on the fly”. In particular, many-core machines allow for the method to be used with the need of very expensive parallel clusters. Results are shown here for flows around cylinders and simple ship shapes
    corecore