A Flexible Patch-Based Lattice Boltzmann Parallelization Approach for Heterogeneous GPU-CPU Clusters
Sustaining a large fraction of single GPU performance in parallel
computations is considered to be the major problem of GPU-based clusters. In
this article, this topic is addressed in the context of a lattice Boltzmann
flow solver that is integrated in the WaLBerla software framework. We propose a
multi-GPU implementation using a block-structured MPI parallelization, suitable
for load balancing and heterogeneous computations on CPUs and GPUs. The
overhead required for multi-GPU simulations is discussed in detail and it is
demonstrated that the kernel performance can be sustained to a large extent.
With our GPU implementation, we achieve nearly perfect weak scalability on
InfiniBand clusters. However, in strong scaling scenarios multi-GPUs make less
efficient use of the hardware than IBM BG/P and x86 clusters. Hence, a cost
analysis must determine the best course of action for a particular simulation
task. Additionally, weak scaling results of heterogeneous simulations conducted
on CPUs and GPUs simultaneously are presented using clusters equipped with
varying node configurations.
Comment: 20 pages, 12 figures
Dynamic Load Balancing Techniques for Particulate Flow Simulations
Parallel multiphysics simulations often suffer from load imbalances
originating from the applied coupling of algorithms with spatially and
temporally varying workloads. It is thus desirable to minimize these imbalances
to reduce the time to solution and to better utilize the available hardware
resources. Taking particulate flows as an illustrative example application, we
present and evaluate load balancing techniques that tackle this challenging
task. This involves a load estimation step in which the currently generated
workload is predicted. We describe in detail how such a workload estimator can
be developed. In a second step, load distribution strategies like space-filling
curves or graph partitioning are applied to dynamically distribute the load
among the available processes. To compare and analyze their performance, we
apply these techniques to a benchmark scenario and observe a reduction of the
load imbalances by almost a factor of four. This results in a decrease of the
overall runtime by 14% for space-filling curves
Steering in computational science: mesoscale modelling and simulation
This paper outlines the benefits of computational steering for high
performance computing applications. Lattice-Boltzmann mesoscale fluid
simulations of binary and ternary amphiphilic fluids in two and three
dimensions are used to illustrate the substantial improvements which
computational steering offers in terms of resource efficiency and time to
discover new physics. We discuss details of our current steering
implementations and describe their future outlook with the advent of
computational grids.
Comment: 40 pages, 11 figures. Accepted for publication in Contemporary Physics