Search CORE

352 research outputs found

HICFD – Highly Efficient Implementation of CFD Codes for HPC Many-Core Architectures

Author: Alrutz Thomas
Aumann Petra
Backhaus Jan
Basermann Achim
Feldhoff Kim
Gerhold Thomas
Hunger Jörg
Jägersküpper Jens
Kersken Hans-Peter
Knobloch Olaf
Kroll Norbert
Krzikalla Olaf
Kügeler Edmund
Müller-Pfefferkorn Ralph
Puetz Mathias
Schreiber Andreas
Simmendinger Christian
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 15/02/2012

The objective of the German BMBF research project Highly Efficient Implementation of CFD Codes for HPC Many-Core Architectures (HICFD) is to develop new methods and tools for analyzing and optimizing the performance of parallel computational fluid dynamics (CFD) codes on high-performance computer systems with many-core processors. The project's work packages investigate how the performance of parallel CFD codes written in C can be increased by making optimal use of all levels of parallelism. At the highest level, MPI is used. At the level of the many-core architecture, highly scalable hybrid OpenMP/MPI methods are implemented. At the level of the processor cores, the parallel SIMD units provided by modern CPUs are exploited.

Institute of Transport Research:Publications

A Tuned and Scalable Fast Multipole Method as a Preeminent Algorithm for Exascale Systems

Author: Bergman K
Chandramowlishwaran A
Hamada T
Lorena A Barba
Rahimian A
Rio Yokota
Warren M
Yokota R
Publication venue: 'SAGE Publications'
Publication date: 16/10/2011

Among the algorithms that are likely to play a major role in future exascale computing, the fast multipole method (FMM) appears as a rising star. Our recent work showed scaling of an FMM on GPU clusters, with problem sizes on the order of billions of unknowns. That work led to an extremely parallel FMM, scaling to thousands of GPUs or tens of thousands of CPUs. This paper reports on a campaign of performance tuning and scalability studies using multi-core CPUs on the Kraken supercomputer. All kernels in the FMM were parallelized using OpenMP, and a test using 10^7 particles randomly distributed in a cube showed 78% efficiency on 8 threads. Tuning of the particle-to-particle kernel using SIMD instructions resulted in a 4x speed-up of the overall algorithm on single-core tests with 10^3 - 10^7 particles. Parallel scalability was studied in both strong and weak scaling. The strong scaling test used 10^8 particles and resulted in 93% parallel efficiency on 2048 processes for the non-SIMD code and 54% for the SIMD-optimized code (which was still 2x faster). The weak scaling test used 10^6 particles per process and resulted in 72% efficiency on 32,768 processes, with the largest calculation taking about 40 seconds to evaluate more than 32 billion unknowns. This work builds up evidence for our view that FMM is poised to play a leading role in exascale computing, and we end the paper with a discussion of the features that make it a particularly favorable algorithm for the emerging heterogeneous and massively parallel architectural landscape.
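The scaling figures in this abstract can be cross-checked with a little arithmetic. The sketch below is illustrative only and is not taken from the paper: the function names are hypothetical, and it assumes each parallel efficiency is measured against a baseline run of the same code (non-SIMD against non-SIMD, SIMD against SIMD). Under that assumption it shows why a code with 54% efficiency can still be 2x faster than one with 93% efficiency, and it confirms the weak-scaling problem size.

```python
def strong_scaling_efficiency(t_base, p_base, t_p, p):
    """Parallel efficiency of a run on p processes relative to a
    baseline run of the same code on p_base processes."""
    return (t_base * p_base) / (t_p * p)


def implied_baseline_speedup(eff_reference, eff_optimized, speedup_at_scale):
    """Baseline speed-up the optimized code must have over the reference
    code to keep `speedup_at_scale` when both run on the same number of
    processes. Assumes each efficiency is relative to its own baseline."""
    return speedup_at_scale * eff_reference / eff_optimized


def weak_scaling_problem_size(unknowns_per_process, processes):
    """Total number of unknowns in a weak-scaling run."""
    return unknowns_per_process * processes


if __name__ == "__main__":
    # Figures quoted in the abstract: 93% (non-SIMD), 54% (SIMD), 2x faster.
    s = implied_baseline_speedup(0.93, 0.54, 2.0)
    print(f"implied baseline speed-up of the SIMD code: {s:.2f}x")

    # 10^6 particles per process on 32,768 processes.
    n = weak_scaling_problem_size(10**6, 32768)
    print(f"weak-scaling problem size: {n:,} unknowns")
```

The implied baseline speed-up of roughly 3.4x is an inference from the quoted percentages, not a number reported in the abstract. The weak-scaling product, 32,768 x 10^6 = 32.768 billion, matches the "more than 32 billion unknowns" stated above.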

arXiv.org e-Print Archive

Crossref

On the Porting and Optimisation of Physics Simulations for Heterogeneous Parallel Processors

Author: Martineau Matt J
Publication date: 25/06/2019

Explore Bristol Research

Symbolic crosschecking of data-parallel floating-point code

Author: Cadar
Collingbourne
Kelly PHJ
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 17/12/2013

Spiral - Imperial College Digital Repository

A high-performance open-source framework for multiphysics simulation and adjoint-based shape and topology optimization

Author: Carrusca Gomes Pedro
Publication venue: Aeronautics, Imperial College London
Publication date: 01/02/2022

The first part of this thesis presents the advances made in the open-source software SU2 towards transforming it into a high-performance framework for the design and optimization of multiphysics problems. Through this work, and in collaboration with other authors, a tenfold performance improvement was achieved for some problems. More importantly, problems that had previously been impossible to solve in SU2 can now be used in numerical optimization with shape or topology variables. Furthermore, it is now considerably simpler to study new multiphysics applications and to develop new numerical schemes that take advantage of modern high-performance-computing systems. In the second part of this thesis, these capabilities allowed the application of topology optimization to medium-scale fluid-structure interaction problems using high-fidelity models (nonlinear elasticity and Reynolds-averaged Navier-Stokes equations), which had not been done before in the literature. This showed that topology optimization can be used to target aerodynamic objectives by tailoring the interaction between fluid and structure. However, it also made evident the limitations of density-based methods for this type of problem, in particular in reliably converging to discrete solutions. This was overcome with new strategies to both guarantee and accelerate (i.e. reduce the overall computational cost) the convergence to discrete solutions in fluid-structure interaction problems.

Spiral - Imperial College Digital Repository