352 research outputs found
HICFD – Highly Efficient Implementation of CFD Codes for HPC Many-Core Architectures
The objective of the German BMBF research project Highly Efficient Implementation
of CFD Codes for HPC Many-Core Architectures (HICFD) is to develop
new methods and tools for the analysis and optimization of the performance
of parallel computational fluid dynamics (CFD) codes on high performance computer
systems with many-core processors. In the work packages of the project it
is investigated how the performance of parallel CFD codes written in C can be increased
by the optimal use of all parallelism levels. On the highest level MPI is
utilized. Furthermore, on the level of the many-core architecture, highly scaling,
hybrid OpenMP/MPI methods are implemented. On the level of the processor cores
the parallel SIMD units provided by modern CPUs are exploited
A Tuned and Scalable Fast Multipole Method as a Preeminent Algorithm for Exascale Systems
Among the algorithms that are likely to play a major role in future exascale
computing, the fast multipole method (FMM) appears as a rising star. Our
previous recent work showed scaling of an FMM on GPU clusters, with problem
sizes in the order of billions of unknowns. That work led to an extremely
parallel FMM, scaling to thousands of GPUs or tens of thousands of CPUs. This
paper reports on a a campaign of performance tuning and scalability studies
using multi-core CPUs, on the Kraken supercomputer. All kernels in the FMM were
parallelized using OpenMP, and a test using 10^7 particles randomly distributed
in a cube showed 78% efficiency on 8 threads. Tuning of the
particle-to-particle kernel using SIMD instructions resulted in 4x speed-up of
the overall algorithm on single-core tests with 10^3 - 10^7 particles. Parallel
scalability was studied in both strong and weak scaling. The strong scaling
test used 10^8 particles and resulted in 93% parallel efficiency on 2048
processes for the non-SIMD code and 54% for the SIMD-optimized code (which was
still 2x faster). The weak scaling test used 10^6 particles per process, and
resulted in 72% efficiency on 32,768 processes, with the largest calculation
taking about 40 seconds to evaluate more than 32 billion unknowns. This work
builds up evidence for our view that FMM is poised to play a leading role in
exascale computing, and we end the paper with a discussion of the features that
make it a particularly favorable algorithm for the emerging heterogeneous and
massively parallel architectural landscape
A high-performance open-source framework for multiphysics simulation and adjoint-based shape and topology optimization
The first part of this thesis presents the advances made in the Open-Source software SU2,
towards transforming it into a high-performance framework for design and optimization of
multiphysics problems. Through this work, and in collaboration with other authors, a tenfold
performance improvement was achieved for some problems. More importantly, problems that
had previously been impossible to solve in SU2, can now be used in numerical optimization
with shape or topology variables. Furthermore, it is now exponentially simpler to study new
multiphysics applications, and to develop new numerical schemes taking advantage of modern
high-performance-computing systems.
In the second part of this thesis, these capabilities allowed the application of topology optimiza-
tion to medium scale fluid-structure interaction problems, using high-fidelity models (nonlinear
elasticity and Reynolds-averaged Navier-Stokes equations), which had not been done before
in the literature. This showed that topology optimization can be used to target aerodynamic
objectives, by tailoring the interaction between fluid and structure. However, it also made ev-
ident the limitations of density-based methods for this type of problem, in particular, reliably
converging to discrete solutions. This was overcome with new strategies to both guarantee and
accelerate (i.e. reduce the overall computational cost) the convergence to discrete solutions in
fluid-structure interaction problems.Open Acces
- …