94 research outputs found

    Study of fluid displacement in 3D porous media with an improved multi-component pseudopotential lattice Boltzmann method

    We generalize to three dimensions (3D) a recently developed improved multi-component pseudopotential lattice Boltzmann method and analyze its applicability to simulating flows through realistic porous media. The model is validated and characterized via benchmarks, and we investigate its performance by simulating the displacement of immiscible fluids in 3D geometries. Two samples are considered: a numerically generated pack of spheres and an experimentally obtained Bentheimer sandstone rock sample. We show that with this model it is possible to simulate realistic viscosity ratios, to tune surface tension independently and, most importantly, to preserve the volume of trapped fluid. We also evaluate the computational performance of the model on the Graphical Processing Unit (GPU) and describe the optimizations implemented to increase computational speed and reduce memory requirements. Comment: arXiv admin note: text overlap with arXiv:2111.0866

    Transport in complex systems : a lattice Boltzmann approach

    The aim of this work is to investigate whether transport processes in complex fluid-dynamics systems can be modeled efficiently with the lattice Boltzmann method (LBM). System complexity was treated in a multifaceted way, and the specific systems analyzed covered a broad range of physical problems, including multiphase flows, hemodynamics, and turbulence. In all cases, particular attention was paid to numerical aspects: the accuracy of the models used, as well as the speed with which they yield a satisfactory solution. As part of this work, the Sailfish software package was developed, an open-source implementation of the lattice Boltzmann method for graphics processing units (GPUs). After an analysis of its performance, validation, and a discussion of its design principles, the package was used to simulate three types of flows. The first were Bretherton/Taylor flows in two- and three-dimensional geometries, simulated with the free-energy model. Analysis of the results showed good agreement with data available in the literature, both experimental and obtained with other numerical methods. The second problem studied was blood flow in realistic geometries of the arteries supplying blood to the human brain. The simulation results were compared in detail with a solution obtained by the finite volume method using the OpenFOAM package, accelerated by a commercial library that enables computations on GPUs. Good agreement between the two methods was obtained, and the lattice Boltzmann method was shown to run simulations up to about 20 times faster. The third problem analyzed was turbulent flow in simple geometries. After all implemented relaxation models were validated on the Kida vortex case, flows in an empty channel and in the presence of obstacles were studied. 
The simulations used both grids providing full resolution down to the Kolmogorov scales and grids of lower resolution. In this context too, good agreement was shown between the lattice Boltzmann results and the results of other simulations and experimental studies. It was also shown that the LBM implementation in the Sailfish package provides greater computational stability than that described in the literature for the same flows and relaxation models.

    Accuracy and performance of the lattice Boltzmann method with 64-bit, 32-bit, and customized 16-bit number formats

    Fluid dynamics simulations with the lattice Boltzmann method (LBM) are very memory-intensive. Alongside a reduction in memory footprint, significant performance benefits can be achieved by using FP32 (single) precision instead of FP64 (double) precision, especially on GPUs. Here, we evaluate the possibility of using even FP16 and Posit16 (half) precision for storing fluid populations, while still carrying out arithmetic operations in FP32. For this, we first show that the commonly occurring number range in the LBM is a lot smaller than the FP16 number range. Based on this observation, we develop novel 16-bit formats - based on a modified IEEE-754 and on a modified Posit standard - that are specifically tailored to the needs of the LBM. We then carry out an in-depth characterization of LBM accuracy for six different test systems with increasing complexity: Poiseuille flow, Taylor-Green vortices, Karman vortex streets, lid-driven cavity, a microcapsule in shear flow (utilizing the immersed-boundary method) and finally the impact of a raindrop (based on a Volume-of-Fluid approach). We find that the difference in accuracy between FP64 and FP32 is negligible in almost all cases, and that for a large number of cases even 16-bit is sufficient. Finally, we provide a detailed performance analysis of all precision levels on a large number of hardware microarchitectures and show that significant speedup is achieved with mixed FP32/16-bit. Comment: 30 pages, 20 figures, 4 tables, 2 code listing
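The storage-versus-arithmetic split described in this abstract can be illustrated with a minimal sketch. It assumes standard IEEE-754 FP16 as provided by NumPy, not the customized 16-bit formats the paper develops. The sample populations (values near a lattice weight of 1/18 with small fluctuations) are a hypothetical stand-in for real simulation data.

```python
import numpy as np


def roundtrip_error(values, storage_dtype):
    """Store values in storage_dtype, load them back as FP32, return the max abs error."""
    stored = values.astype(storage_dtype)   # compact representation kept in memory
    loaded = stored.astype(np.float32)      # widened again before any arithmetic
    return float(np.max(np.abs(loaded.astype(np.float64) - values)))


rng = np.random.default_rng(0)
# Hypothetical populations: close to a lattice weight of 1/18, small fluctuations.
populations = 1.0 / 18.0 + 1e-3 * rng.standard_normal(10_000)

err16 = roundtrip_error(populations, np.float16)
err32 = roundtrip_error(populations, np.float32)

# FP16 storage halves the memory footprint relative to FP32 storage.
bytes16 = populations.astype(np.float16).nbytes
bytes32 = populations.astype(np.float32).nbytes
```

The sketch only measures the rounding cost of one store/load cycle. It says nothing about how such errors accumulate over many time steps, which is what the accuracy study in the abstract addresses.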

    Efficient Algorithms And Optimizations For Scientific Computing On Many-Core Processors

    Designing efficient algorithms for many-core and multicore architectures requires using different strategies to allow for the best exploitation of the hardware resources on those architectures. Researchers have ported many scientific applications to modern many-core and multicore parallel architectures, and by doing so they have achieved significant speedups over running on single CPU cores. While many applications have achieved significant speedups, some applications still require more effort to accelerate due to their inherently serial behavior. One class of applications with this serial behavior is Monte Carlo simulations. Monte Carlo simulations have been used to simulate many problems in statistical physics and statistical mechanics that were not possible to simulate using Molecular Dynamics. While there are a fair number of well-known and recognized GPU Molecular Dynamics codes, the existing Monte Carlo ensemble simulations have not been ported to the GPU, so they are relatively slow and cannot run large systems in a reasonable amount of time. Because of these shortcomings of existing Monte Carlo ensemble codes, and because researchers are interested in a fast Monte Carlo simulation framework that can simulate large systems, a new GPU framework called GOMC is implemented to simulate different particle and molecular-based force fields and ensembles. GOMC simulates different Monte Carlo ensembles such as the canonical, grand canonical, and Gibbs ensembles. This work describes many challenges in developing a GPU Monte Carlo code for such ensembles and how I addressed these challenges. This work also describes efficient many-core and multicore large-scale energy calculations for Monte Carlo Gibbs ensemble using cell lists. Designing Monte Carlo molecular simulations is challenging as they have less computation and parallelism when compared to similar molecular dynamics applications. 
The modified cell list allows for more speedup gains for energy calculations on both many-core and multicore architectures when compared to other implementations without using the conventional cell lists. The work presents results and analysis of the cell list algorithms for each of the parallel architectures using top-of-the-line GPUs, CPUs, and Intel's Phi coprocessors. In addition, the work evaluates the performance of the cell list algorithms for different problem sizes and different radial cutoffs. It also evaluates two cell list approaches, a hybrid MPI+OpenMP approach and a hybrid MPI+CUDA approach. The cell list methods are evaluated on a small cluster of multicore CPUs, Intel Phi coprocessors, and GPUs. The performance results are evaluated using different combinations of MPI processes, threads, and problem sizes. Another application presented in this dissertation involves the understanding of the properties of crystalline materials, and their design and control. Recent developments include the introduction of new models to simulate system behavior and properties that are of large experimental and theoretical interest. One of those models is the Phase-Field Crystal (PFC) model. The PFC model has enabled researchers to simulate 2D and 3D crystal structures and study defects such as dislocations and grain boundaries. In this work, GPUs are used to accelerate the computation of various dynamic properties of polycrystals in the 2D PFC model. Some properties require very intensive computation that may involve hundreds of thousands of atoms. The GPU implementation has achieved significant speedups of more than 46 times for some large-system simulations.
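For readers unfamiliar with the cell lists this abstract refers to, here is a minimal serial sketch of the conventional technique. It is not the modified cell list of the dissertation and not GOMC code. It assumes a periodic cubic box, positions in [0, box), and a cutoff of at most half the box side. All function names are hypothetical.

```python
import math
from collections import defaultdict
from itertools import product


def minimum_image_distance(a, b, box):
    """Distance between a and b in a periodic cubic box of side `box`."""
    total = 0.0
    for p, q in zip(a, b):
        d = p - q
        d -= box * round(d / box)  # wrap to the nearest periodic image
        total += d * d
    return math.sqrt(total)


def build_cell_list(positions, box, cutoff):
    """Bin particle indices into cubic cells whose side is at least `cutoff`."""
    n_cells = max(1, int(box // cutoff))
    cell_size = box / n_cells
    cells = defaultdict(list)
    for index, pos in enumerate(positions):
        key = tuple(math.floor(c / cell_size) % n_cells for c in pos)
        cells[key].append(index)
    return cells, n_cells


def pairs_within_cutoff(positions, box, cutoff):
    """Return the set of index pairs (i, j), i < j, closer than `cutoff`."""
    cells, n_cells = build_cell_list(positions, box, cutoff)
    pairs = set()
    for cell, members in cells.items():
        # Only the cell itself and its 26 periodic neighbours can hold partners.
        for offset in product((-1, 0, 1), repeat=3):
            neighbour = tuple((c + o) % n_cells for c, o in zip(cell, offset))
            for i in members:
                for j in cells.get(neighbour, ()):
                    if i < j and minimum_image_distance(
                        positions[i], positions[j], box
                    ) < cutoff:
                        pairs.add((i, j))
    return pairs
```

Because each particle is compared only with particles in adjacent cells, the cost of the pair search grows roughly linearly with particle count at fixed density, instead of quadratically as in an all-pairs loop.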

    Real-Time Simulation of Indoor Air Flow using the Lattice Boltzmann Method on Graphics Processing Unit

    This thesis investigates the usability of the lattice Boltzmann method (LBM) for the simulation of indoor air flows in real-time. It describes the work undertaken during the three years of a Ph.D. study in the School of Mechanical Engineering at the University of Leeds, England. Real-time fluid simulation, i.e. the ability to simulate a virtual system as fast as the real system would evolve, can benefit many engineering applications, such as the optimisation of ventilation system design in data centres or the simulation of pollutant transport in hospitals. Although real-time fluid simulation is an active field of research in computer graphics, that work generally focuses on creating visually appealing animation rather than on physical accuracy. The approach taken for this thesis is different, as it starts from a physics-based model, the lattice Boltzmann method, and takes advantage of the computational power of a graphics processing unit (GPU) to achieve real-time compute capability while maintaining good physical accuracy. The lattice Boltzmann method is reviewed and detailed references are given for a variety of models. Particular attention is given to turbulence modelling using the Smagorinsky model in LBM for the simulation of high Reynolds number flow and the coupling of two LBM simulations to simulate thermal flows under the Boussinesq approximation. A detailed analysis of the implementation of the LBM on GPU is conducted. Special attention is given to the optimisation of the algorithm, and the program kernel is shown to achieve a performance of up to 1.5 billion lattice node updates per second, which is found to be sufficient for coarse real-time simulations. Additionally, a review of the real-time visualisation integrated within the program is presented and some of the techniques for automated code generation are introduced. 
The resulting software is validated against benchmark flows, using their analytical solutions whenever possible, or against other simulation results obtained using accepted methods from classical computational fluid dynamics (CFD), either as published in the literature or simulated in-house. The LBM is shown to resolve the flow with similar accuracy and in less time.

    Multi-objective optimisation methods applied to aircraft techno-economic and environmental issues

    Engineering methods that couple multi-objective optimisation (MOO) techniques with high fidelity computational tools are expected to minimise the environmental impact of aviation while increasing the growth, with the potential to reveal innovative solutions. In order to mitigate the compromise between computational efficiency and fidelity, these methods can be accelerated by harnessing the computational efficiency of Graphic Processor Units (GPUs). The aim of the research is to develop a family of engineering methods to support research in aviation with respect to the environmental and economic aspects. In order to reveal the non-dominated trade-off, also known as Pareto Front (PF), among conflicting objectives, a MOO algorithm, called Multi-Objective Tabu Search 2 (MOTS2), is developed, benchmarked relative to state-of-the-art methods and accelerated by using GPUs. A prototype fluid solver based on GPU is also developed, so as to simulate the mixing capability of a microreactor that could potentially be used in fuel-saving technologies in aviation. By using the aforementioned methods, optimal aircraft trajectories in terms of flight time, fuel consumption and emissions are generated, and alternative designs of a microreactor are suggested, so as to assess the trade-offs between pressure losses and the micro-mixing capability. As a key contribution to knowledge, with reference to competitive optimisers and previous cases, the capabilities of the proposed methodology are illustrated in prototype applications of aircraft trajectory optimisation (ATO) and micromixing optimisation with 2 and 3 objectives, under operational and geometrical constraints, respectively. In the short-term, ATO ought to be applied to existing aircraft. In the long-term, improving the micro-mixing capability of a microreactor is expected to enable the use of hydrogen-based fuel. 
This methodology is also benchmarked and assessed relative to state-of-the-art techniques in ATO and micro-mixing optimisation with known and unknown trade-offs, whereas the former could only optimise 2 objectives and the latter could not exploit the computational efficiency of GPUs. The impact of deploying on GPUs a micro-mixing flow solver, which accelerates the generation of trade-off against a reference study, and MOTS2, which illustrates the scalability potential, is assessed. With regard to standard analytical function test cases and verification cases in MOO, MOTS2 can handle the multi-modality of the trade-off of ZDT4, which is a MOO benchmark function with many local optima that presents a challenge for a state-of-the-art genetic algorithm for ATO, called NSGAMO, based on case studies in the public domain. However, MOTS2 demonstrated worse performance on ZDT3, which is a MOO benchmark function with a discontinuous trade-off, for which NSGAMO successfully captured the target PF. Comparing their overall performance, if the shape of the PF is known, MOTS2 should be preferred in problems with multi-modal trade-offs, whereas NSGAMO should be employed in discontinuous PFs. The shape of the trade-off between the objectives in airfoil shape optimisation, ATO and micro-mixing optimisation was continuous. The weakness of MOTS2 to sufficiently capture the discontinuous PF of ZDT3 was not critical in the studied examples … [cont.]
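The notion of a non-dominated trade-off (Pareto Front) used in this abstract can be made concrete with a minimal sketch. It assumes every objective is minimised and uses a plain quadratic filter. It is not MOTS2 or NSGAMO, and the sample points are hypothetical.

```python
def dominates(a, b):
    """True if a is no worse than b in every objective and strictly better in one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_front(points):
    """Keep only the points that no other point dominates (all objectives minimised)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical (flight time, fuel consumption) pairs in arbitrary units.
candidates = [(1, 5), (2, 3), (3, 4), (4, 1), (5, 5)]
front = pareto_front(candidates)
```

Here `(3, 4)` is discarded because `(2, 3)` is better in both objectives, and `(5, 5)` is discarded because `(1, 5)` is at least as good in both and strictly better in one.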