
    Free-Surface Lattice-Boltzmann Simulation on Many-Core Architectures

    Current advances in many-core technologies demand simulation algorithms suited to these architectures, while the accompanying increase in computational power makes real-time and interactive simulations both possible and desirable. We present an OpenCL implementation of a Lattice-Boltzmann-based free-surface solver for GPU architectures. Massively parallel execution requires special techniques to keep the interface region consistent, which we address with a novel multipass method. We further compare the performance of different memory layouts for both a basic driven cavity implementation and the free-surface method, point out the capabilities of our implementation in real-time and interactive scenarios, and briefly present visualizations of the flow obtained in real time.

    A Flexible Patch-Based Lattice Boltzmann Parallelization Approach for Heterogeneous GPU-CPU Clusters

    Sustaining a large fraction of single-GPU performance in parallel computations is considered the major problem of GPU-based clusters. This article addresses the topic in the context of a lattice Boltzmann flow solver integrated in the WaLBerla software framework. We propose a multi-GPU implementation using a block-structured MPI parallelization, suitable for load balancing and heterogeneous computations on CPUs and GPUs. The overhead required for multi-GPU simulations is discussed in detail, and it is demonstrated that the kernel performance can be sustained to a large extent. With our GPU implementation, we achieve nearly perfect weak scalability on InfiniBand clusters. However, in strong scaling scenarios multi-GPU setups make less efficient use of the hardware than IBM BG/P and x86 clusters. Hence, a cost analysis must determine the best course of action for a particular simulation task. Additionally, weak scaling results of heterogeneous simulations conducted on CPUs and GPUs simultaneously are presented using clusters equipped with varying node configurations. Comment: 20 pages, 12 figures

    Steering in computational science: mesoscale modelling and simulation

    This paper outlines the benefits of computational steering for high performance computing applications. Lattice-Boltzmann mesoscale fluid simulations of binary and ternary amphiphilic fluids in two and three dimensions are used to illustrate the substantial improvements that computational steering offers in terms of resource efficiency and time to discover new physics. We discuss details of our current steering implementations and describe their future outlook with the advent of computational grids. Comment: 40 pages, 11 figures. Accepted for publication in Contemporary Physics

    A Simulation Suite for Lattice-Boltzmann based Real-Time CFD Applications Exploiting Multi-Level Parallelism on Modern Multi- and Many-Core Architectures

    We present a software approach to hardware-oriented numerics which builds upon an augmented, previously published open-source set of libraries facilitating portable code development and optimisation on a wide range of modern computer architectures. In order to maximise efficiency, we exploit all levels of parallelism, including vectorisation within CPU cores, the Cell BE and GPUs, shared memory thread-level parallelism between cores, and parallelism between heterogeneous distributed memory resources in clusters. To evaluate and validate our approach, we implement a collection of modular building blocks for the easy and fast assembly and development of CFD applications based on the shallow water equations: we combine the Lattice-Boltzmann method with fluid-structure interaction techniques in order to achieve real-time simulations targeting interactive virtual environments. Our results demonstrate that recent multi-core CPUs outperform the Cell BE, while GPUs are significantly faster than conventional multi-threaded SSE code. In addition, we verify good scalability properties of our application on small clusters.

    A Multi-Core Numerical Framework for Characterizing Flow in Oil Reservoirs

    Presented at the SCS Spring Simulation Multi-Conference – SpringSim 2011, April 4-7, 2011 – Boston, USA. Awarded Best Paper in the 19th High Performance Computing Symposium and Best Overall Paper at SpringSim 2011. This paper presents a numerical framework that enables scalable, parallel execution of engineering simulations on multi-core, shared memory architectures. Distribution of the simulations is done by selective hash-tabling of the model domain, which spatially decomposes it into a number of orthogonal computational tasks. These tasks, the size of which is critical to optimal cache blocking and consequently performance, are then distributed for execution to multiple threads using the previously presented task management algorithm, H-Dispatch. Two numerical methods, smoothed particle hydrodynamics (SPH) and the lattice Boltzmann method (LBM), are discussed in the present work, although the framework is general enough to be used with any explicit time integration scheme. The implementation of both SPH and the LBM within the parallel framework is outlined, and the performance of each is presented in terms of speed-up and efficiency. On the 24-core server used in this research, near-linear scalability was achieved for both numerical methods with utilization efficiencies up to 95%. To close, the framework is employed to simulate fluid flow in a porous rock specimen, which is of broad geophysical significance, particularly in enhanced oil recovery.
