94 research outputs found
Study of fluid displacement in 3D porous media with an improved multi-component pseudopotential lattice Boltzmann method
We generalize to three dimensions (3D) a recently developed improved
multi-component pseudopotential lattice Boltzmann method and analyze its
applicability to simulate flows through realistic porous media. The model is
validated and characterized via benchmarks, and we investigate its performance
by simulating the displacement of immiscible fluids in 3D geometries. Two
samples are considered, namely, a pack of spheres obtained numerically, and a
Bentheimer sandstone rock sample obtained experimentally. We show that, with
this model, it is possible to simulate realistic viscosity ratios, to tune
surface tension independently and, most importantly, to preserve the volume of
trapped fluid. We also evaluate the computational performance of the model on
a graphics processing unit (GPU) and describe the optimizations implemented
to increase the computational speed and reduce the memory requirements.
Comment: arXiv admin note: text overlap with arXiv:2111.0866
Transport in complex systems : a lattice Boltzmann approach
The aim of this work is to investigate whether transport processes in complex fluid-dynamical systems can be modelled efficiently with the lattice Boltzmann method (LBM). System complexity was treated from several angles, and the specific systems analysed covered a wide range of physical problems, including multiphase flows, haemodynamics and turbulence. In all cases, particular attention was paid to numerical aspects: the accuracy of the models used, as well as the speed with which they yield a satisfactory solution.
As part of this work, the Sailfish software package was developed, an open implementation of the lattice Boltzmann method for graphics processing units (GPUs). After an analysis of its performance, validation, and a discussion of its design assumptions, the package was used to simulate three types of flows.
The first were Bretherton/Taylor flows in two- and three-dimensional geometries, which were simulated with the free-energy model. Analysis of the results showed good agreement with data available in the literature, both experimental and obtained with other numerical methods. The second problem studied was blood flow in realistic geometries of the arteries supplying blood to the human brain. The simulation results were compared in detail with the solution obtained by the finite volume method using the OpenFOAM package, accelerated by a commercial library that enables computations on GPUs. Good agreement between the methods was obtained, and the lattice Boltzmann method was shown to allow simulations to run up to about 20 times faster. The third problem analysed was turbulent flow in simple geometries. After all implemented relaxation models had been validated on the Kida vortex case, flows in an empty channel and in the presence of obstacles were studied. The simulations used both grids providing full resolution down to the Kolmogorov scales and grids of lower resolution. In this context, too, good agreement was shown between the results obtained with the lattice Boltzmann method and the results of other simulations and experimental studies. It was also shown that the LBM implementation in the Sailfish package provides greater computational stability than the one described in the literature for the same flows and relaxation models.
Accuracy and performance of the lattice Boltzmann method with 64-bit, 32-bit, and customized 16-bit number formats
Fluid dynamics simulations with the lattice Boltzmann method (LBM) are very
memory-intensive. Alongside reduction in memory footprint, significant
performance benefits can be achieved by using FP32 (single) precision compared
to FP64 (double) precision, especially on GPUs. Here, we evaluate the
possibility to use even FP16 and Posit16 (half) precision for storing fluid
populations, while still carrying out arithmetic operations in FP32. For this, we
first show that the commonly occurring number range in the LBM is a lot smaller
than the FP16 number range. Based on this observation, we develop novel 16-bit
formats - based on a modified IEEE-754 and on a modified Posit standard - that
are specifically tailored to the needs of the LBM. We then carry out an
in-depth characterization of LBM accuracy for six different test systems with
increasing complexity: Poiseuille flow, Taylor-Green vortices, Karman vortex
streets, lid-driven cavity, a microcapsule in shear flow (utilizing the
immersed-boundary method) and finally the impact of a raindrop (based on a
Volume-of-Fluid approach). We find that the difference in accuracy between FP64
and FP32 is negligible in almost all cases, and that for a large number of
cases even 16-bit is sufficient. Finally, we provide a detailed performance
analysis of all precision levels on a large number of hardware
microarchitectures and show that significant speedup is achieved with mixed
FP32/16-bit.
Comment: 30 pages, 20 figures, 4 tables, 2 code listings
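The storage scheme described in this abstract (16-bit storage of the fluid populations, FP32 arithmetic) can be illustrated with a minimal sketch. It uses NumPy's standard IEEE-754 `float16`, not the customized 16-bit formats the abstract develops, and the single-relaxation-time update `f + omega * (f_eq - f)` is an assumption chosen for illustration. The names `store_fp16`, `load_fp32` and `relax_step` are hypothetical.

```python
import numpy as np


def store_fp16(values):
    """Round FP32 populations to 16-bit for storage (halves the memory)."""
    return values.astype(np.float16)


def load_fp32(stored):
    """Widen stored 16-bit populations to FP32 before any arithmetic."""
    return stored.astype(np.float32)


def relax_step(stored, f_eq, omega):
    """One relaxation step: load as FP32, compute in FP32, store as FP16.

    Assumption: a plain single-relaxation-time update is used here only
    as an example of arithmetic performed between load and store.
    """
    f = load_fp32(stored)
    f_new = f + np.float32(omega) * (f_eq.astype(np.float32) - f)
    return store_fp16(f_new)
```

Only the arrays kept between steps are 16-bit; every intermediate value inside `relax_step` is FP32, which is the split between storage and arithmetic precision that the abstract evaluates.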
Efficient Algorithms And Optimizations For Scientific Computing On Many-Core Processors
Designing efficient algorithms for many-core and multicore architectures requires using different strategies to allow for the best exploitation of the hardware resources on those architectures. Researchers have ported many scientific applications to modern many-core and multicore parallel architectures, and by doing so they have achieved significant speedups over running on single CPU cores. While many applications have achieved significant speedups, some applications still require more effort to accelerate due to their inherently serial behavior.
One class of applications with this serial behavior is Monte Carlo simulation. Monte Carlo simulations have been used to study many problems in statistical physics and statistical mechanics that could not be simulated with Molecular Dynamics. While there are a fair number of well-known and recognized GPU Molecular Dynamics codes, the existing Monte Carlo ensemble simulations have not been ported to the GPU, so they are relatively slow and cannot run large systems in a reasonable amount of time. To address these shortcomings, and because researchers want a fast Monte Carlo simulation framework that can handle large systems, a new GPU framework called GOMC is implemented to simulate different particle- and molecular-based force fields and ensembles. GOMC simulates different Monte Carlo ensembles such as the canonical, grand canonical, and Gibbs ensembles. This work describes many challenges in developing a GPU Monte Carlo code for such ensembles and how I addressed these challenges.
This work also describes efficient many-core and multicore large-scale energy calculations for the Monte Carlo Gibbs ensemble using cell lists. Designing Monte Carlo molecular simulations is challenging as they have less computation and parallelism when compared to similar molecular dynamics applications. The modified cell list allows for more speedup gains for energy calculations on both many-core and multicore architectures when compared to other implementations that do not use conventional cell lists. The work presents results and analysis of the cell list algorithms for each of the parallel architectures using top-of-the-line GPUs, CPUs, and Intel’s Phi coprocessors. In addition, the work evaluates the performance of the cell list algorithms for different problem sizes and different radial cutoffs.
In addition, this work evaluates two cell list approaches, a hybrid MPI+OpenMP approach and a hybrid MPI+CUDA approach. The cell list methods are evaluated on a small cluster of multicore CPUs, Intel Phi coprocessors, and GPUs. The performance results are evaluated using different combinations of MPI processes, threads, and problem sizes.
Another application presented in this dissertation involves the understanding of the properties of crystalline materials, and their design and control. Recent developments include the introduction of new models to simulate system behavior and properties that are of large experimental and theoretical interest. One of those models is the Phase-Field Crystal (PFC) model. The PFC model has enabled researchers to simulate 2D and 3D crystal structures and study defects such as dislocations and grain boundaries. In this work, GPUs are used to accelerate the computation of various dynamic properties of polycrystals in the 2D PFC model. Some properties require very intensive computation that may involve hundreds of thousands of atoms. The GPU implementation has achieved significant speedups of more than 46 times for some simulations of large systems.
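For readers unfamiliar with the cell lists mentioned in this abstract, the following is a minimal serial sketch of the conventional technique, not the modified cell list or the GPU, OpenMP or MPI implementations the dissertation describes. It assumes a non-periodic 3D box and a cell edge equal to the radial cutoff. All function names are hypothetical.

```python
from collections import defaultdict
from itertools import product
import math


def build_cells(points, cutoff):
    """Bin each point index into a cubic cell whose edge equals the cutoff."""
    cells = defaultdict(list)
    for idx, p in enumerate(points):
        key = tuple(int(math.floor(c / cutoff)) for c in p)
        cells[key].append(idx)
    return cells


def pairs_within_cutoff(points, cutoff):
    """Find all pairs (i, j), i < j, closer than the cutoff using cell lists.

    Two points within the cutoff always lie in the same or in adjacent
    cells, so only the 27 surrounding cells have to be searched.
    """
    cells = build_cells(points, cutoff)
    found = set()
    for key, members in cells.items():
        for offset in product((-1, 0, 1), repeat=3):
            neighbour = tuple(k + o for k, o in zip(key, offset))
            # .get avoids creating empty cells while iterating
            for j in cells.get(neighbour, ()):
                for i in members:
                    if i < j and math.dist(points[i], points[j]) <= cutoff:
                        found.add((i, j))
    return found


def pairs_brute_force(points, cutoff):
    """Reference O(N^2) search used to check the cell-list result."""
    n = len(points)
    return {(i, j) for i in range(n) for j in range(i + 1, n)
            if math.dist(points[i], points[j]) <= cutoff}
```

The cell list restricts each distance check to nearby cells, which is why the cost of an energy calculation depends on the radial cutoff and the problem size, the two parameters the abstract varies.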
Real-Time Simulation of Indoor Air Flow using the Lattice Boltzmann Method on Graphics Processing Unit
This thesis investigates the usability of the lattice Boltzmann method (LBM) for the simulation of indoor air flows in real-time. It describes the work undertaken during the three years of a Ph.D. study in the School of Mechanical Engineering at the University of Leeds, England.
Real-time fluid simulation, i.e. the ability to simulate a virtual system as fast as the real system would evolve, can benefit many engineering applications, such as the optimisation of ventilation system design in data centres or the simulation of pollutant transport in hospitals. Although real-time fluid simulation is an active field of research in computer graphics, that research generally focuses on creating visually appealing animation rather than on physical accuracy. The approach taken in this thesis is different, as it starts from a physics-based model, the lattice Boltzmann method, and takes advantage of the computational power of a graphics processing unit (GPU) to achieve real-time compute capability while maintaining good physical accuracy.
The lattice Boltzmann method is reviewed and detailed references are given for a variety of models. Particular attention is given to turbulence modelling using the Smagorinsky model in LBM for the simulation of high Reynolds number flows, and to the coupling of two LBM simulations to simulate thermal flows under the Boussinesq approximation.
A detailed analysis of the implementation of the LBM on GPU is conducted. Special attention is given to the optimisation of the algorithm, and the program kernel is shown to achieve a performance of up to 1.5 billion lattice node updates per second, which is found to be sufficient for coarse real-time simulations. Additionally, the real-time visualisation integrated within the program is reviewed and some of the techniques for automated code generation are introduced.
The resulting software is validated against benchmark flows, using their analytical solutions whenever possible, or against other simulation results obtained using accepted methods from classical computational fluid dynamics (CFD), either as published in the literature or simulated in-house. The LBM is shown to resolve the flow with similar accuracy and in less time.
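The throughput quoted in this abstract can be related to the real-time requirement with a short calculation. The sketch below is an illustration only: the grid sizes and the physical time step in the usage note are hypothetical values, not figures from the thesis, and the function names are assumptions.

```python
def steps_per_second(updates_per_second, nodes):
    """Time steps per wall-clock second for a grid with `nodes` lattice nodes."""
    return updates_per_second / nodes


def is_real_time(updates_per_second, nodes, dt_seconds):
    """True if at least one simulated second passes per wall-clock second.

    dt_seconds is the physical time represented by one lattice time step
    (a hypothetical input here; it depends on the chosen discretisation).
    """
    simulated_per_wall_second = steps_per_second(updates_per_second, nodes) * dt_seconds
    return simulated_per_wall_second >= 1.0
```

With 1.5 billion node updates per second and an assumed time step of 1 ms, a grid of one million nodes advances 1500 steps, or 1.5 simulated seconds, per wall-clock second. A grid of eight million nodes manages under 0.2 simulated seconds, which shows why only coarse grids reach real time at this throughput.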
Multi-objective optimisation methods applied to aircraft techno-economic and environmental issues
Engineering methods that couple multi-objective optimisation (MOO) techniques
with high fidelity computational tools are expected to minimise the environmental
impact of aviation while increasing the growth, with the potential to reveal innovative
solutions. In order to mitigate the compromise between computational
efficiency and fidelity, these methods can be accelerated by harnessing the computational
efficiency of Graphics Processing Units (GPUs).
The aim of the research is to develop a family of engineering methods to support
research in aviation with respect to the environmental and economic aspects. In order
to reveal the non-dominated trade-off, also known as Pareto Front (PF), among
conflicting objectives, a MOO algorithm, called Multi-Objective Tabu Search 2
(MOTS2), is developed, benchmarked relative to state-of-the-art methods and accelerated
by using GPUs. A prototype fluid solver based on GPU is also developed,
so as to simulate the mixing capability of a microreactor that could potentially be
used in fuel-saving technologies in aviation. By using the aforementioned methods,
optimal aircraft trajectories in terms of flight time, fuel consumption and emissions
are generated, and alternative designs of a microreactor are suggested, so as
to assess the trade-offs between pressure losses and the micro-mixing capability.
As a key contribution to knowledge, with reference to competitive optimisers
and previous cases, the capabilities of the proposed methodology are illustrated
in prototype applications of aircraft trajectory optimisation (ATO) and micromixing
optimisation with 2 and 3 objectives, under operational and geometrical
constraints, respectively. In the short-term, ATO ought to be applied to existing
aircraft. In the long-term, improving the micro-mixing capability of a microreactor
is expected to enable the use of hydrogen-based fuel. This methodology
is also benchmarked and assessed relative to state-of-the-art techniques in ATO
and micro-mixing optimisation with known and unknown trade-offs, whereas the
former could only optimise 2 objectives and the latter could not exploit the computational
efficiency of GPUs. The impact of deploying on GPUs a micro-mixing
flow solver, which accelerates the generation of trade-off against a reference study,
and MOTS2, which illustrates the scalability potential, is assessed.
With regard to standard analytical function test cases and verification cases
in MOO, MOTS2 can handle the multi-modality of the trade-off of ZDT4, which
is a MOO benchmark function with many local optima that presents a challenge
for a state-of-the-art genetic algorithm for ATO, called NSGAMO, based on case
studies in the public domain. However, MOTS2 demonstrated worse performance
on ZDT3, which is a MOO benchmark function with a discontinuous trade-off,
for which NSGAMO successfully captured the target PF. Comparing their overall
performance, if the shape of the PF is known, MOTS2 should be preferred in
problems with multi-modal trade-offs, whereas NSGAMO should be employed in
discontinuous PFs. The shape of the trade-off between the objectives in airfoil
shape optimisation, ATO and micro-mixing optimisation was continuous. The
weakness of MOTS2 in capturing the discontinuous PF of ZDT3 was not
critical in the studied examples … [cont.]
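The non-dominated trade-off (Pareto Front) that this abstract refers to can be illustrated with a minimal sketch. It assumes that all objectives are minimised and uses a simple quadratic-time filter. It is not MOTS2 or NSGAMO, and the function names are hypothetical.

```python
def dominates(a, b):
    """True if a is no worse than b in every objective and better in at least one."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))


def pareto_front(points):
    """Keep only the points that no other point dominates (minimisation)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]
```

For example, with hypothetical (flight time, fuel consumption) pairs `[(1.0, 5.0), (2.0, 3.0), (3.0, 4.0), (4.0, 1.0)]`, the point `(3.0, 4.0)` is dominated by `(2.0, 3.0)` and is dropped, while the other three form the front.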
- …