    Leveraging the performance of LBM-HPC for large sizes on GPUs using ghost cells

    Today, the scientific community has a growing demand for larger and more efficient computational resources. The emergence of GPUs for general-purpose computing has been an important step towards meeting this demand: these devices offer impressive computational capacity at low cost and with efficient power consumption. However, the memory available on these devices is sometimes insufficient, making computationally expensive memory transfers between CPU and GPU necessary and causing a dramatic drop in performance. Recently, the Lattice-Boltzmann Method has established itself as an efficient methodology for fluid simulations. Although the method has features that are particularly amenable to efficient exploitation on parallel computers, it requires considerable memory capacity, which can be an important drawback, in particular on GPUs. In the present paper, a new GPU-based implementation is proposed that minimizes these requirements with respect to other state-of-the-art implementations. It allows us to execute almost 2x larger problems without additional memory transfers, achieving faster executions when dealing with large problems.
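
    As a rough illustration of the ghost-cell idea, the sketch below lays out a D2Q9 lattice with a one-cell ghost border and performs a pull-style streaming step, so interior cells can always read their neighbours without bounds checks. This is a minimal sketch under assumed names (NX, NY, Q, idx, stream_pull, f_src, f_dst), not the implementation proposed in the paper; for clarity it still streams between two arrays, whereas the paper's scheme is aimed precisely at avoiding that duplication.

```cuda
// Minimal sketch (not the paper's implementation): a D2Q9 lattice stored
// with a one-cell ghost border so interior threads can always read their
// nine neighbours without bounds checks.  NX x NY is the interior size.
#include <cuda_runtime.h>

#define NX 1024
#define NY 1024
#define Q  9

// Interior cell (x, y), 0 <= x < NX, lives at padded coordinates (x+1, y+1).
__host__ __device__ inline int idx(int x, int y, int q) {
    return ((y + 1) * (NX + 2) + (x + 1)) * Q + q;
}

// D2Q9 lattice velocities.
__constant__ int cx[Q] = { 0, 1, 0, -1, 0, 1, -1, -1, 1 };
__constant__ int cy[Q] = { 0, 0, 1, 0, -1, 1, 1, -1, -1 };

// Pull-style streaming: every interior cell gathers populations from its
// neighbours; cells on the border read from the ghost layer, which a
// separate boundary kernel would refresh each time step.
__global__ void stream_pull(const float* __restrict__ f_src,
                            float* __restrict__ f_dst) {
    int x = blockIdx.x * blockDim.x + threadIdx.x;
    int y = blockIdx.y * blockDim.y + threadIdx.y;
    if (x >= NX || y >= NY) return;
    for (int q = 0; q < Q; ++q)
        f_dst[idx(x, y, q)] = f_src[idx(x - cx[q], y - cy[q], q)];
}

int main() {
    // One padded lattice copy: (NX + 2) * (NY + 2) cells, Q populations each.
    // Two copies are used here only for simplicity of the sketch.
    size_t bytes = (size_t)(NX + 2) * (NY + 2) * Q * sizeof(float);
    float *f_src, *f_dst;
    cudaMalloc(&f_src, bytes);
    cudaMalloc(&f_dst, bytes);
    cudaMemset(f_src, 0, bytes);

    dim3 block(16, 16);
    dim3 grid((NX + block.x - 1) / block.x, (NY + block.y - 1) / block.y);
    stream_pull<<<grid, block>>>(f_src, f_dst);
    cudaDeviceSynchronize();

    cudaFree(f_src);
    cudaFree(f_dst);
    return 0;
}
```

    In a full solver, a separate and much cheaper kernel would refresh the ghost layer each time step, for example to impose periodic or bounce-back boundary conditions.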

    Reducing memory requirements for large size LBM simulations on GPUs

    The scientific community, on its never-ending road towards larger and more efficient computational resources, needs implementations that can adapt efficiently to current parallel platforms. Graphics processing units are an appropriate platform for covering some of these demands: this architecture offers high performance at reduced cost and with efficient power consumption. However, the memory capacity of these devices is limited, and expensive memory transfers become necessary when dealing with big problems. Today, the lattice-Boltzmann method (LBM) has established itself as an efficient approach for Computational Fluid Dynamics simulations. Although this method is particularly amenable to efficient parallelization, it needs considerable memory capacity, which causes a dramatic fall in performance when dealing with large simulations. In this work, we propose several strategies to minimize this memory demand, which allow us to execute bigger simulations on the same platform without additional memory transfers while keeping high performance. In particular, we present two new implementations, LBM-Ghost and LBM-Swap, which are analyzed in depth, presenting the pros and cons of each. This project was funded by the Spanish Ministry of Economy and Competitiveness (MINECO): BCAM Severo Ochoa accreditation SEV-2013-0323, MTM2013-40824, Computación de Altas Prestaciones VII TIN2015-65316-P; by the Basque Excellence Research Center (BERC 2014-2017) program of the Basque Government; and by the Departament d'Innovació, Universitats i Empresa de la Generalitat de Catalunya, under project MPEXPAR: Models de Programació i Entorns d'Execució Paral·lels (2014-SGR-1051). We also thank the computing facilities of the Extremadura Research Centre for Advanced Technologies (CETA-CIEMAT) and the NVIDIA GPU Research Center program for the provided resources, as well as the support of NVIDIA through the BSC/UPC NVIDIA GPU Center of Excellence. Peer reviewed. Postprint (author's final draft).
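
    The headroom that a single-lattice scheme buys can be made concrete with a back-of-the-envelope estimate: a classical two-lattice (A-B) implementation stores two full copies of the particle distribution functions, while swap- or ghost-cell-style implementations keep roughly one copy plus a small amount of auxiliary storage. The numbers below are purely illustrative assumptions (a 512^3 D3Q19 lattice in single precision), not the measurements reported in the paper.

```cuda
// Back-of-the-envelope footprint comparison.  All sizes are illustrative
// assumptions, not the paper's measurements.
#include <cstdio>
#include <cstddef>

int main() {
    const std::size_t nx = 512, ny = 512, nz = 512;  // hypothetical lattice size
    const std::size_t q  = 19;                       // D3Q19 populations per cell
    const std::size_t cells = nx * ny * nz;
    const double GiB = 1024.0 * 1024.0 * 1024.0;

    // Classical two-lattice (A-B) scheme: read from copy A, write to copy B.
    double two_lattice = 2.0 * cells * q * sizeof(float) / GiB;

    // Single-lattice scheme (swap / ghost-cell style): one set of populations
    // plus, as an assumption, a one-cell ghost shell of auxiliary storage.
    std::size_t padded = (nx + 2) * (ny + 2) * (nz + 2);
    double one_lattice = (double)padded * q * sizeof(float) / GiB;

    std::printf("two-lattice  : %6.2f GiB\n", two_lattice);
    std::printf("one-lattice  : %6.2f GiB\n", one_lattice);
    std::printf("ratio        : %6.2fx\n", two_lattice / one_lattice);
    return 0;
}
```

    On these assumed figures the two-lattice layout needs about 19 GiB against roughly 9.6 GiB for the single-lattice layout, i.e. close to the almost-2x headroom discussed in the abstracts above.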

    A GPU-accelerated package for simulation of flow in nanoporous source rocks with many-body dissipative particle dynamics

    Mesoscopic simulations of hydrocarbon flow in source shales are challenging, in part due to the heterogeneous shale pores with sizes ranging from a few nanometers to a few micrometers. Additionally, the sub-continuum fluid-fluid and fluid-solid interactions in nano- to micro-scale shale pores, which are physically and chemically sophisticated, must be captured. To address those challenges, we present a GPU-accelerated package for simulation of flow in nano- to micro-pore networks with a many-body dissipative particle dynamics (mDPD) mesoscale model. Based on a fully distributed parallel paradigm, the code offloads all intensive workloads onto GPUs. Other advancements, such as smart particle packing and no-slip boundary conditions in complex pore geometries, are also implemented for the construction and simulation of realistic shale pores from 3D nanometer-resolution stack images. Our code is validated for accuracy and compared against the CPU counterpart for speedup. In our benchmark tests, the code delivers nearly perfect strong scaling and weak scaling (with up to 512 million particles) on up to 512 K20X GPUs on Oak Ridge National Laboratory's (ORNL) Titan supercomputer. Moreover, a single-GPU benchmark on ORNL's SummitDev and IBM's AC922 suggests that the host-to-device NVLink can boost performance over PCIe by a remarkable 40%. Lastly, we demonstrate, through a flow simulation in realistic shale pores, that the CPU counterpart requires 840 Power9 cores to rival the performance delivered by our package with four V100 GPUs on ORNL's Summit architecture. This simulation package enables quick-turnaround and high-throughput mesoscopic numerical simulations for investigating complex flow phenomena in nano- to micro-porous rocks with realistic pore geometries.
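
    Purely as a schematic of how a pairwise mesoscale force evaluation can be offloaded to a GPU with one thread per particle, one might write something like the kernel below. It uses a brute-force neighbour loop and a generic DPD-style conservative weight rather than the package's actual mDPD forces, neighbour lists or multi-GPU decomposition, and every identifier in it is hypothetical.

```cuda
// Schematic pair-force kernel: one thread per particle, brute-force O(N^2)
// neighbour loop, and a generic DPD-style conservative force
//   F_C = a * (1 - r/rc) * e_ij   for r < rc.
// A production mDPD code would use neighbour/cell lists, a domain
// decomposition across GPUs, and dissipative, random and
// density-dependent terms in addition to this one.
#include <cuda_runtime.h>

__global__ void pair_force(const float3* __restrict__ pos,
                           float3* __restrict__ force,
                           int n, float a, float rc) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= n) return;

    float3 pi = pos[i];
    float3 fi = { 0.f, 0.f, 0.f };
    for (int j = 0; j < n; ++j) {
        if (j == i) continue;
        float dx = pi.x - pos[j].x;
        float dy = pi.y - pos[j].y;
        float dz = pi.z - pos[j].z;
        float r2 = dx * dx + dy * dy + dz * dz;
        if (r2 >= rc * rc || r2 < 1e-12f) continue;   // outside cutoff or overlapping
        float r = sqrtf(r2);
        float w = a * (1.f - r / rc) / r;             // magnitude / r, so that
        fi.x += w * dx;                               // (dx, dy, dz) * w = F_C * e_ij
        fi.y += w * dy;
        fi.z += w * dz;
    }
    force[i] = fi;
}
```

    With device arrays d_pos and d_force of length n, such a kernel would be launched as pair_force<<<(n + 255) / 256, 256>>>(d_pos, d_force, n, a, rc); replacing the O(N^2) loop with cell lists is what makes the approach scale to the particle counts quoted above.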

    Lattice Boltzmann Liquid Simulations on Graphics Hardware

    Fluid simulation is widely used in the visual effects industry. The high level of detail required to produce realistic visual effects requires significant computation. Usually, expensive computer clusters are used in order to reduce the time required. However, general purpose Graphics Processing Unit (GPU) computing has potential as a relatively inexpensive way to reduce these simulation times. In recent years, GPUs have been used to achieve enormous speedups via their massively parallel architectures. Within the field of fluid simulation, the Lattice Boltzmann Method (LBM) stands out as a candidate for GPU execution because its grid-based structure is a natural fit for GPU parallelism. This thesis describes the design and implementation of a GPU-based free-surface LBM fluid simulation. Broadly, our approach is to ensure that the steps that perform most of the work in the LBM (the stream and collide steps) make efficient use of GPU resources. We achieve this by removing complexity from the core stream and collide steps and handling interactions with obstacles and tracking of the fluid interface in separate GPU kernels. To determine the efficiency of our design, we perform separate, detailed analyses of the performance of the kernels associated with the stream and collide steps of the LBM. We demonstrate that these kernels make efficient use of GPU resources and achieve speedups of 29.6 and 223.7, respectively. Our analysis of the overall performance of all kernels shows that significant time is spent performing obstacle adjustment and interface movement as a result of limitations associated with GPU memory accesses. Lastly, we compare our GPU LBM implementation with a single-core CPU LBM implementation. Our results show speedups of up to 81.6 with no significant differences in output from the simulations on both platforms. We conclude that order of magnitude speedups are possible using GPUs to perform free-surface LBM fluid simulations, and that GPUs can, therefore, significantly reduce the cost of performing high-detail fluid simulations for visual effects.
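
    To make the division of labour described above concrete, a generic BGK collide kernel for a D2Q9 lattice might look as follows. It is a sketch of the standard collision step only, with obstacle handling and interface tracking deliberately left to other kernels in the spirit of the thesis; it is not the thesis's code, and all identifiers are assumptions.

```cuda
// Generic D2Q9 BGK collision step: f_i <- f_i - omega * (f_i - f_i^eq),
// with omega = 1/tau.  Obstacle and free-surface handling are assumed to
// live in separate kernels; this kernel touches bulk cells only.
#include <cuda_runtime.h>

#define Q 9
__constant__ float w9[Q]  = { 4.f/9, 1.f/9, 1.f/9, 1.f/9, 1.f/9,
                              1.f/36, 1.f/36, 1.f/36, 1.f/36 };
__constant__ int   ex9[Q] = { 0, 1, 0, -1, 0, 1, -1, -1, 1 };
__constant__ int   ey9[Q] = { 0, 0, 1, 0, -1, 1, 1, -1, -1 };

__global__ void collide_bgk(float* __restrict__ f, int ncells, float omega) {
    int cell = blockIdx.x * blockDim.x + threadIdx.x;
    if (cell >= ncells) return;

    // Load populations and compute macroscopic density and velocity.
    float fi[Q], rho = 0.f, ux = 0.f, uy = 0.f;
    for (int q = 0; q < Q; ++q) {
        fi[q] = f[cell * Q + q];
        rho  += fi[q];
        ux   += fi[q] * ex9[q];
        uy   += fi[q] * ey9[q];
    }
    ux /= rho;
    uy /= rho;

    // Relax each population towards its second-order equilibrium.
    float usq = 1.5f * (ux * ux + uy * uy);
    for (int q = 0; q < Q; ++q) {
        float cu  = 3.f * (ex9[q] * ux + ey9[q] * uy);
        float feq = w9[q] * rho * (1.f + cu + 0.5f * cu * cu - usq);
        f[cell * Q + q] = fi[q] - omega * (fi[q] - feq);
    }
}
```

    The matching stream kernel and the obstacle/interface kernels would then be launched in sequence each time step, e.g. collide_bgk<<<(ncells + 255) / 256, 256>>>(d_f, ncells, 1.0f / tau).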

    Computational Explorations in Biomedicine: Unraveling Molecular Dynamics for Cancer, Drug Delivery, and Biomolecular Insights using LAMMPS Simulations

    With the rapid advancement of computational techniques, Molecular Dynamics (MD) simulations have emerged as powerful tools in biomedical research, enabling in-depth investigations of biological systems at the atomic level. Among the diverse range of simulation software available, LAMMPS (Large-scale Atomic/Molecular Massively Parallel Simulator) has gained significant recognition for its versatility, scalability, and extensive range of functionalities. This literature review aims to provide a comprehensive overview of the utilization of LAMMPS in the field of biomedical applications. The review begins by outlining the fundamental principles of MD simulations and highlighting the unique features of LAMMPS that make it suitable for biomedical research. Subsequently, a survey of the literature is conducted to identify key studies that have employed LAMMPS in various biomedical contexts, such as protein folding, drug design, biomaterials, and cellular processes. The reviewed studies demonstrate the remarkable contributions of LAMMPS in understanding the behavior of biological macromolecules, investigating drug-protein interactions, elucidating the mechanical properties of biomaterials, and studying cellular processes at the molecular level. Additionally, this review explores the integration of LAMMPS with other computational tools and experimental techniques, showcasing its potential for synergistic investigations that bridge the gap between theory and experiment. Moreover, this review discusses the challenges and limitations associated with using LAMMPS in biomedical simulations, including the parameterization of force fields, system size limitations, and computational efficiency. Strategies employed by researchers to mitigate these challenges are presented, along with potential future directions for enhancing LAMMPS capabilities in the biomedical field.

    Proceedings of the 5th bwHPC Symposium

    In modern science, the demand for more powerful and integrated research infrastructures is growing constantly to address computational challenges in data analysis, modeling and simulation. The bwHPC initiative, founded by the Ministry of Science, Research and the Arts and the universities in Baden-Württemberg, is a state-wide federated approach aimed at assisting scientists with mastering these challenges. At the 5th bwHPC Symposium in September 2018, scientific users, technical operators and government representatives came together for two days at the University of Freiburg. The symposium provided an opportunity to present scientific results that were obtained with the help of bwHPC resources. Additionally, the symposium served as a platform for discussing and exchanging ideas concerning the use of these large scientific infrastructures as well as their further development.

    Lattice-Boltzmann simulations of cerebral blood flow

    Computational haemodynamics play a central role in the understanding of blood behaviour in the cerebral vasculature, increasing our knowledge of the onset of vascular diseases and their progression, improving diagnosis and ultimately providing better patient prognosis. Computer simulations hold the potential to accurately characterise the motion of blood and its interaction with the vessel wall, providing the capability to assess surgical treatments with no danger to the patient. These aspects contribute considerably to a better understanding of blood circulation processes as well as to augmenting pre-treatment planning. Existing software environments for treatment planning consist of several stages, each requiring significant user interaction and processing time, significantly limiting their use in clinical scenarios. The aim of this PhD is to provide clinicians and researchers with a tool to aid in the understanding of human cerebral haemodynamics. This tool employs a high performance fluid solver based on the lattice-Boltzmann method (coined HemeLB), high performance distributed computing and grid computing, and various advanced software applications useful to efficiently set up and run patient-specific simulations. A graphical tool is used to segment the vasculature from patient-specific CT or MR data and configure boundary conditions with ease, creating models of the vasculature in real time. Blood flow visualisation is done in real time using in situ rendering techniques implemented within the parallel fluid solver and aided by steering capabilities; these programming strategies allow the clinician to interactively display the simulation results on a local workstation. A separate software application is used to numerically compare simulation results carried out at different spatial resolutions, providing a strategy to approach numerical validation. The developed software and supporting computational infrastructure were used to study various patient-specific intracranial aneurysms with the collaborating interventionalists at the National Hospital for Neurology and Neuroscience (London), using three-dimensional rotational angiography data to define the patient-specific vasculature. Blood flow motion was depicted in detail by the visualisation capabilities, clearly showing vortex fluid flow features and stress distribution at the inner surface of the aneurysms and their surrounding vasculature. These investigations permitted the clinicians to rapidly assess the risk associated with the growth and rupture of each aneurysm. The ultimate goal of this work is to aid clinical practice with an efficient, easy-to-use toolkit for real-time decision support.
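
    As an illustration of how a lattice-Boltzmann solver can derive stresses directly from the lattice populations, the sketch below estimates the deviatoric stress tensor of each cell from its non-equilibrium part, using the standard lattice-unit relation sigma_ab ~ -(1 - 1/(2*tau)) * sum_i (f_i - f_i^eq) c_ia c_ib. It is a generic D2Q9 sketch with assumed identifiers, not HemeLB's implementation; a 3D solver would use a D3Q15/19 lattice and project the tensor onto the wall normal to obtain wall shear stress.

```cuda
// Generic sketch (not HemeLB's code): per-cell deviatoric stress from the
// non-equilibrium populations of a D2Q9 lattice, in lattice units.
#include <cuda_runtime.h>

#define Q 9
__constant__ float wq[Q]  = { 4.f/9, 1.f/9, 1.f/9, 1.f/9, 1.f/9,
                              1.f/36, 1.f/36, 1.f/36, 1.f/36 };
__constant__ int   cxq[Q] = { 0, 1, 0, -1, 0, 1, -1, -1, 1 };
__constant__ int   cyq[Q] = { 0, 0, 1, 0, -1, 1, 1, -1, -1 };

__global__ void deviatoric_stress(const float* __restrict__ f,
                                  float* __restrict__ sigma,  // xx, xy, yy per cell
                                  int ncells, float tau) {
    int cell = blockIdx.x * blockDim.x + threadIdx.x;
    if (cell >= ncells) return;

    // Macroscopic density and velocity of this cell.
    float fi[Q], rho = 0.f, ux = 0.f, uy = 0.f;
    for (int q = 0; q < Q; ++q) {
        fi[q] = f[cell * Q + q];
        rho  += fi[q];
        ux   += fi[q] * cxq[q];
        uy   += fi[q] * cyq[q];
    }
    ux /= rho;
    uy /= rho;

    // Second moment of the non-equilibrium part of the populations.
    float usq = 1.5f * (ux * ux + uy * uy);
    float pxx = 0.f, pxy = 0.f, pyy = 0.f;
    for (int q = 0; q < Q; ++q) {
        float cu   = 3.f * (cxq[q] * ux + cyq[q] * uy);
        float feq  = wq[q] * rho * (1.f + cu + 0.5f * cu * cu - usq);
        float fneq = fi[q] - feq;
        pxx += fneq * cxq[q] * cxq[q];
        pxy += fneq * cxq[q] * cyq[q];
        pyy += fneq * cyq[q] * cyq[q];
    }
    float coeff = -(1.f - 1.f / (2.f * tau));
    sigma[cell * 3 + 0] = coeff * pxx;
    sigma[cell * 3 + 1] = coeff * pxy;
    sigma[cell * 3 + 2] = coeff * pyy;
}
```

    Such a kernel would typically run as a post-processing or in-situ visualisation pass, e.g. deviatoric_stress<<<(ncells + 255) / 256, 256>>>(d_f, d_sigma, ncells, tau), with the resulting tensor sampled only at cells adjacent to the vessel wall.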