Leveraging the Performance of LBM-HPC for Large Sizes on GPUs using Ghost Cells
Today, the scientific community's demand for larger and more efficient computational resources keeps growing. The appearance of GPUs for general-purpose computing was an important advance towards meeting that demand. These devices offer impressive computational capacity at low cost with efficient power consumption. However, the memory available on these devices is sometimes insufficient, so computationally expensive memory transfers between CPU and GPU become necessary, causing a dramatic fall in performance. Recently, the Lattice-Boltzmann Method has established itself as an efficient methodology for fluid simulations. Although this method has features that are particularly amenable to efficient exploitation on parallel computers, it requires considerable memory capacity, which can be an important drawback, particularly on GPUs. This paper proposes a new GPU-based implementation that minimizes such requirements with respect to other state-of-the-art implementations. It allows us to execute almost 2× bigger problems without additional memory transfers, achieving faster executions when dealing with large problems.
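A rough way to see where a gain of "almost 2" could come from is to count how many lattice nodes fit in a fixed amount of device memory. The sketch below assumes, hypothetically, that the saving comes from storing one copy of the distribution functions plus a layer of ghost cells instead of two full copies; the abstract does not state this, and the D2Q9 lattice, double precision, square domain and 6 GiB memory size are illustrative assumptions only.

```python
import math

def max_nodes(memory_bytes, q=9, bytes_per_value=8, copies=2):
    """Largest number of lattice nodes whose distributions fit in memory."""
    return memory_bytes // (q * bytes_per_value * copies)

# Hypothetical device with 6 GiB of memory (not a figure from the abstract).
memory = 6 * 1024**3

# Two-lattice scheme: separate source and destination copies of every node.
side_two = math.isqrt(max_nodes(memory, copies=2))

# Assumed one-lattice scheme: a single copy, with one ghost layer on each
# side of a square domain, so only (side - 2)^2 nodes are usable.
side_one = math.isqrt(max_nodes(memory, copies=1)) - 2

ratio = (side_one ** 2) / (side_two ** 2)
print(f"usable nodes, two lattices: {side_two ** 2}")
print(f"usable nodes, one lattice + ghost layer: {side_one ** 2}")
print(f"ratio: {ratio:.4f}")
```

Under these assumptions the ratio stays slightly below 2 because the ghost layer consumes a small part of the memory that was freed.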
Reducing memory requirements for large size LBM simulations on GPUs
The scientific community, in its never-ending pursuit of larger and more efficient computational resources, needs implementations that adapt efficiently to current parallel platforms. Graphics processing units are an appropriate platform that covers some of these demands. This architecture offers high performance at a reduced cost with efficient power consumption. However, the memory capacity of these devices is limited, so expensive memory transfers are necessary to deal with big problems. Today, the lattice-Boltzmann method (LBM) has established itself as an efficient approach for Computational Fluid Dynamics simulations. Although this method is particularly amenable to efficient parallelization, it needs considerable memory capacity, which causes a dramatic fall in performance when dealing with large simulations. In this work, we propose some initiatives to minimize this memory demand, which allows us to execute bigger simulations on the same platform without additional memory transfers while keeping high performance. In particular, we present two new implementations, LBM-Ghost and LBM-Swap, which are analyzed in depth, presenting the pros and cons of each.
This project was funded by the Spanish Ministry of Economy and Competitiveness (MINECO): BCAM Severo Ochoa accreditation SEV-2013-0323, MTM2013-40824, Computación de Altas Prestaciones VII TIN2015-65316-P, by the Basque Excellence Research Center (BERC 2014-2017) program of the Basque Government, and by the Departament d'Innovació, Universitats i Empresa de la Generalitat de Catalunya, under project MPEXPAR: Models de Programació i Entorns d'Execució Paral·lels (2014-SGR-1051). We also thank the computing facilities of Extremadura Research Centre for Advanced Technologies (CETA-CIEMAT) and the NVIDIA GPU Research Center program for the provided resources, as well as NVIDIA for its support through the BSC/UPC NVIDIA GPU Center of Excellence.
Peer reviewed. Postprint (author's final draft).
A GPU-accelerated package for simulation of flow in nanoporous source rocks with many-body dissipative particle dynamics
Mesoscopic simulations of hydrocarbon flow in source shales are challenging,
in part due to the heterogeneous shale pores with sizes ranging from a few
nanometers to a few micrometers. Additionally, the sub-continuum fluid-fluid
and fluid-solid interactions in nano- to micro-scale shale pores, which are
physically and chemically sophisticated, must be captured. To address those
challenges, we present a GPU-accelerated package for simulation of flow in
nano- to micro-pore networks with a many-body dissipative particle dynamics
(mDPD) mesoscale model. Based on a fully distributed parallel paradigm, the
code offloads all intensive workloads to GPUs. Other advancements, such as
smart particle packing and no-slip boundary condition in complex pore
geometries, are also implemented for the construction and the simulation of the
realistic shale pores from 3D nanometer-resolution stack images. Our code is
validated for accuracy and compared against the CPU counterpart for speedup. In
our benchmark tests, the code delivers nearly perfect strong scaling and weak
scaling (with up to 512 million particles) on up to 512 K20X GPUs on Oak Ridge
National Laboratory's (ORNL) Titan supercomputer. Moreover, a single-GPU
benchmark on ORNL's SummitDev and IBM's AC922 suggests that the host-to-device
NVLink can boost performance over PCIe by a remarkable 40%. Lastly, we
demonstrate, through a flow simulation in realistic shale pores, that the CPU
counterpart requires 840 Power9 cores to rival the performance delivered by our
package with four V100 GPUs on ORNL's Summit architecture. This simulation
package enables quick-turnaround and high-throughput mesoscopic numerical
simulations for investigating complex flow phenomena in nano- to micro-porous
rocks with realistic pore geometries.
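For readers unfamiliar with the scaling terms used in this abstract, the following minimal sketch shows how strong- and weak-scaling efficiency are commonly computed. The function names and the timings are hypothetical and are not taken from the paper; only the final ratio uses figures quoted in the abstract (840 Power9 cores versus four V100 GPUs).

```python
def strong_scaling_efficiency(t_base, t_n, n_base, n):
    """Fixed total problem size: ideal run time falls as n_base / n."""
    return (t_base * n_base) / (t_n * n)

def weak_scaling_efficiency(t_base, t_n):
    """Fixed problem size per device: ideal run time stays constant."""
    return t_base / t_n

# Hypothetical timings in seconds (illustrative only, not from the paper).
strong = strong_scaling_efficiency(t_base=100.0, t_n=13.0, n_base=1, n=8)
weak = weak_scaling_efficiency(t_base=100.0, t_n=104.0)
print(f"strong-scaling efficiency: {strong:.3f}")
print(f"weak-scaling efficiency:   {weak:.3f}")

# Figures quoted in the abstract: 840 Power9 cores rival four V100 GPUs.
cores_per_gpu = 840 / 4
print(f"CPU cores matched by one GPU: {cores_per_gpu:.0f}")
```

An efficiency of 1.0 corresponds to the "nearly perfect" scaling the abstract reports; values below 1.0 indicate overhead from communication or load imbalance.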
Lattice Boltzmann Liquid Simulations on Graphics Hardware
Fluid simulation is widely used in the visual effects industry. The high level of detail required to produce realistic visual effects requires significant computation. Usually, expensive computer clusters are used in order to reduce the time required. However, general purpose Graphics Processing Unit (GPU) computing has potential as a relatively inexpensive way to reduce these simulation times. In recent years, GPUs have been used to achieve enormous speedups via their massively parallel architectures. Within the field of fluid simulation, the Lattice Boltzmann Method (LBM) stands out as a candidate for GPU execution because its grid-based structure is a natural fit for GPU parallelism.
This thesis describes the design and implementation of a GPU-based free-surface LBM fluid simulation. Broadly, our approach is to ensure that the steps that perform most of the work in the LBM (the stream and collide steps) make efficient use of GPU resources. We achieve this by removing complexity from the core stream and collide steps and handling interactions with obstacles and tracking of the fluid interface in separate GPU kernels.
To determine the efficiency of our design, we perform separate, detailed analyses of the performance of the kernels associated with the stream and collide steps of the LBM. We demonstrate that these kernels make efficient use of GPU resources and achieve speedups of 29.6 and 223.7, respectively. Our analysis of the overall performance of all kernels shows that significant time is spent performing obstacle adjustment and interface movement as a result of limitations associated with GPU memory accesses. Lastly, we compare our GPU LBM implementation with a single-core CPU LBM implementation. Our results show speedups of up to 81.6 with no significant differences in output from the simulations on both platforms.
We conclude that order of magnitude speedups are possible using GPUs to perform free-surface LBM fluid simulations, and that GPUs can, therefore, significantly reduce the cost of performing high-detail fluid simulations for visual effects.
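The stream and collide steps named in this abstract can be illustrated with a minimal CPU sketch. This is not the thesis's GPU code: it assumes a D2Q9 lattice, a BGK collision operator and a periodic domain with no obstacles or free surface, and all function names are hypothetical.

```python
# D2Q9 lattice: discrete velocities and their weights.
C = [(0, 0), (1, 0), (0, 1), (-1, 0), (0, -1),
     (1, 1), (-1, 1), (-1, -1), (1, -1)]
W = [4 / 9] + [1 / 9] * 4 + [1 / 36] * 4

def equilibrium(rho, ux, uy):
    """Second-order equilibrium distributions for one node."""
    usq = ux * ux + uy * uy
    feq = []
    for (cx, cy), w in zip(C, W):
        cu = cx * ux + cy * uy
        feq.append(w * rho * (1 + 3 * cu + 4.5 * cu * cu - 1.5 * usq))
    return feq

def collide(f, tau):
    """BGK collision: relax every node towards its local equilibrium."""
    for row in f:
        for node in row:
            rho = sum(node)
            ux = sum(fi * cx for fi, (cx, _) in zip(node, C)) / rho
            uy = sum(fi * cy for fi, (_, cy) in zip(node, C)) / rho
            feq = equilibrium(rho, ux, uy)
            for i in range(9):
                node[i] -= (node[i] - feq[i]) / tau

def stream(f):
    """Move each distribution one node along its velocity (periodic)."""
    ny, nx = len(f), len(f[0])
    out = [[[0.0] * 9 for _ in range(nx)] for _ in range(ny)]
    for y in range(ny):
        for x in range(nx):
            for i, (cx, cy) in enumerate(C):
                out[(y + cy) % ny][(x + cx) % nx][i] = f[y][x][i]
    return out

def total_mass(f):
    return sum(v for row in f for node in row for v in node)

# Fluid at rest with a small density bump in the middle of an 8x8 grid.
nx, ny, tau = 8, 8, 0.8
f = [[equilibrium(1.0, 0.0, 0.0) for _ in range(nx)] for _ in range(ny)]
f[4][4] = equilibrium(1.1, 0.0, 0.0)

mass_before = total_mass(f)
for _ in range(10):
    collide(f, tau)
    f = stream(f)
mass_after = total_mass(f)
print(f"mass before: {mass_before:.6f}, after: {mass_after:.6f}")
```

Collision touches only node-local data and streaming copies each value to exactly one neighbour, which is why both steps map naturally onto one GPU thread per node. Note that this `stream` writes into a second lattice, so it doubles the memory held during the step.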
Computational Explorations in Biomedicine: Unraveling Molecular Dynamics for Cancer, Drug Delivery, and Biomolecular Insights using LAMMPS Simulations
With the rapid advancement of computational techniques, Molecular Dynamics
(MD) simulations have emerged as powerful tools in biomedical research,
enabling in-depth investigations of biological systems at the atomic level.
Among the diverse range of simulation software available, LAMMPS (Large-scale
Atomic/Molecular Massively Parallel Simulator) has gained significant
recognition for its versatility, scalability, and extensive range of
functionalities. This literature review aims to provide a comprehensive
overview of the utilization of LAMMPS in the field of biomedical applications.
This review begins by outlining the fundamental principles of MD simulations
and highlighting the unique features of LAMMPS that make it suitable for
biomedical research. Subsequently, a survey of the literature is conducted to
identify key studies that have employed LAMMPS in various biomedical contexts,
such as protein folding, drug design, biomaterials, and cellular processes. The
reviewed studies demonstrate the remarkable contributions of LAMMPS in
understanding the behavior of biological macromolecules, investigating
drug-protein interactions, elucidating the mechanical properties of
biomaterials, and studying cellular processes at the molecular level.
Additionally, this review explores the integration of LAMMPS with other
computational tools and experimental techniques, showcasing its potential for
synergistic investigations that bridge the gap between theory and experiment.
Moreover, this review discusses the challenges and limitations associated with
using LAMMPS in biomedical simulations, including the parameterization of force
fields, system size limitations, and computational efficiency. Strategies
employed by researchers to mitigate these challenges are presented, along with
potential future directions for enhancing LAMMPS capabilities in the biomedical
field.
Comment: 39 pages, 10 figures
Proceedings of the 5th bwHPC Symposium
In modern science, the demand for more powerful and integrated research
infrastructures is growing constantly to address computational challenges
in data analysis, modeling and simulation. The bwHPC initiative, founded
by the Ministry of Science, Research and the Arts and the universities in
Baden-Württemberg, is a state-wide federated approach aimed at assisting
scientists with mastering these challenges. At the 5th bwHPC Symposium
in September 2018, scientific users, technical operators and government
representatives came together for two days at the University of Freiburg. The
symposium provided an opportunity to present scientific results that were
obtained with the help of bwHPC resources. Additionally, the symposium served
as a platform for discussing and exchanging ideas concerning the use of these
large scientific infrastructures as well as their further development.
Lattice-Boltzmann simulations of cerebral blood flow
Computational haemodynamics plays a central role in the understanding of blood behaviour
in the cerebral vasculature, increasing our knowledge of the onset of vascular
diseases and their progression, improving diagnosis and ultimately providing better
patient prognosis. Computer simulations hold the potential of accurately characterising
the motion of blood and its interaction with the vessel wall, providing the capability to
assess surgical treatments with no danger to the patient. These aspects contribute
considerably to a better understanding of blood circulation processes as well as to improved
pre-treatment planning. Existing software environments for treatment planning consist
of several stages, each requiring significant user interaction and processing time,
which significantly limits their use in clinical scenarios.
The aim of this PhD is to provide clinicians and researchers with a tool to aid
in the understanding of human cerebral haemodynamics. This tool employs a high
performance fluid solver based on the lattice-Boltzmann method (coined HemeLB),
high performance distributed computing and grid computing, and various advanced
software applications useful to efficiently set up and run patient-specific simulations.
A graphical tool is used to segment the vasculature from patient-specific CT or MR
data and configure boundary conditions with ease, creating models of the vasculature
in real time. Blood flow visualisation is done in real time using in situ rendering
techniques implemented within the parallel fluid solver and aided by steering capabilities;
these programming strategies allow the clinician to interactively display the
simulation results on a local workstation. A separate software application is used
to numerically compare simulation results carried out at different spatial resolutions,
providing a strategy to approach numerical validation. The developed software and
supporting computational infrastructure were used to study various patient-specific
intracranial aneurysms with the collaborating interventionalists at the National Hospital
for Neurology and Neurosurgery (London), using three-dimensional rotational
angiography data to define the patient-specific vasculature. Blood flow motion was
depicted in detail by the visualisation capabilities, clearly showing vortex fluid
flow features and stress distribution at the inner surface of the aneurysms and their surrounding
vasculature. These investigations permitted the clinicians to rapidly assess
the risk associated with the growth and rupture of each aneurysm. The ultimate goal
of this work is to aid clinical practice with an efficient easy-to-use toolkit for real-time
decision support.