173 research outputs found
Advancing the Multi-Solver Paradigm for Overset CFD Toward Heterogeneous Architectures
A multi-solver, overset, computational fluid dynamics framework is developed for efficient, large-scale simulation of rotorcraft problems. Two primary features distinguish the developed framework from the current state of the art. First, the framework is designed for heterogeneous compute architectures, making use of both traditional codes run on the Central Processing Unit (CPU) as well as codes run on the Graphics Processing Unit (GPU). Second, a framework-level implementation of the Generalized Minimal Residual (GMRES) linear solver is used to consider all meshes from all solvers in a single linear system. The developed GPU flow solver and framework are validated against conventional implementations, achieving a 5.35× speedup for a single GPU compared to 24 CPU cores. Similarly, the overset linear solver is compared to traditional techniques, demonstrating that the same convergence order can be achieved using as few as half the number of iterations.
Applications of the developed methods are organized into two chapters. First, the heterogeneous, overset framework is applied to a notional helicopter configuration based on the ROBIN wind tunnel experiments. A tail rotor and hub are added to create a challenging case representative of a realistic, full-rotorcraft simulation. Interactional aerodynamics between the different components is reviewed in detail. The second application chapter focuses on the performance of the overset linear solver for unsteady applications. The GPU solver is used along with an unstructured code to simulate laminar flow over a sphere as well as laminar coaxial rotors designed for a Mars helicopter. In all results, the overset linear solver outperforms the traditional, decoupled approach. Conclusions drawn from both the full-rotorcraft and overset linear solver simulations can have a significant impact on improving the modeling of complex rotorcraft aerodynamics.
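The framework-level GMRES approach above gathers all meshes into one linear system; as a rough illustration of the underlying algorithm, here is a minimal restarted GMRES sketch in NumPy. The function name, restart length, and dense-matrix interface are illustrative assumptions, not the thesis implementation.

```python
import numpy as np

def gmres(A, b, x0=None, m=30, tol=1e-8):
    """Minimal GMRES(m) sketch: build an Arnoldi basis of the Krylov
    subspace, then minimize the residual over it with a small
    least-squares solve. Here A stands in for the coupled overset
    system; in the framework above it would act across all meshes."""
    n = b.size
    x = np.zeros(n) if x0 is None else x0.copy()
    for _ in range(50):                       # restart loop
        r = b - A @ x
        beta = np.linalg.norm(r)
        if beta < tol:
            return x
        Q = np.zeros((n, m + 1))
        H = np.zeros((m + 1, m))
        Q[:, 0] = r / beta
        for j in range(m):                    # Arnoldi process
            w = A @ Q[:, j]
            for i in range(j + 1):            # modified Gram-Schmidt
                H[i, j] = Q[:, i] @ w
                w -= H[i, j] * Q[:, i]
            H[j + 1, j] = np.linalg.norm(w)
            if H[j + 1, j] > 1e-14:
                Q[:, j + 1] = w / H[j + 1, j]
        e1 = np.zeros(m + 1)
        e1[0] = beta
        y, *_ = np.linalg.lstsq(H, e1, rcond=None)
        x += Q[:, :m] @ y
    return x
```

The coupling benefit described in the abstract comes from applying such an iteration to the concatenated unknowns of every solver at once, rather than converging each mesh's system separately.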
Acceleration Techniques for Industrial Large Eddy Simulation with High-Order Methods on CPU-GPU Clusters
One of the key findings of NASA's CFD Vision 2030 document is that the use of CFD in the aerospace design process is severely limited by the inability to accurately and reliably predict turbulent flows with significant regions of separation. Scale-resolving simulations such as large eddy simulation (LES) are increasingly applied to more complex problems such as flow over high-lift configurations and through aircraft engines. The present work has the overall objective of reducing the computational cost of industrial LES. The high-order flux reconstruction (FR) method is used as the spatial discretization scheme. First, two acceleration techniques are investigated: the p-multigrid algorithm and Mach number preconditioning. The Weiss and Smith low Mach number preconditioner is used together with the p-multigrid method, and the third-order explicit Runge-Kutta (RK3) scheme is considered as the smoother to reduce memory requirements. Mach number preconditioning significantly increased the efficiency of the p-multigrid method. For unsteady simulations, the preconditioner improved the efficiency of the p-multigrid method at larger physical time steps. In most steady cases, the preconditioned p-multigrid approach is comparable to or faster than the implicit LU-SGS algorithm and requires less memory, especially for p ≥ 2 schemes. An efficient implementation of the FR method is developed for modern GPU clusters, and the speedup is investigated for different polynomial orders and cell types. Approaches to improve the parallel efficiency of multi-GPU simulations are also studied. The simulation node-hour cost on the Summit supercomputer is reduced by a factor of 50 for hexahedral cells and up to 200 for tetrahedral cells. Two low-memory implicit time integration methods are implemented on GPUs: the matrix-free GMRES solver and a novel local GMRES-SGS method. Parametric studies are performed to evaluate their performance on LES benchmark cases.
On the High-Lift Common Research Model case from the 2021 4th AIAA High-Lift Prediction Workshop, the two GPU implicit time-integration methods provide additional speedups of 14 and 68, respectively, over the GPU explicit time simulation.
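The p-multigrid acceleration discussed in this abstract follows the classic coarse-grid-correction pattern: smooth on the fine (high-order) level, solve a restricted problem, and prolong the correction back. A generic two-level sketch with a weighted-Jacobi smoother is shown below; the Galerkin coarse operator and the simple interpolation-based transfer are illustrative assumptions, not the paper's FR modal transfer operators.

```python
import numpy as np

def jacobi(A, b, x, iters=3, omega=2/3):
    """Weighted-Jacobi smoother: damps high-frequency error modes."""
    D = np.diag(A)
    for _ in range(iters):
        x = x + omega * (b - A @ x) / D
    return x

def two_level(A_f, b, x, P):
    """One two-level correction cycle: pre-smooth on the fine level,
    restrict the residual, solve the small coarse problem exactly,
    prolong the correction back, and post-smooth. P is the
    prolongation (coarse-to-fine) operator."""
    x = jacobi(A_f, b, x)
    r = b - A_f @ x
    A_c = P.T @ A_f @ P              # Galerkin coarse operator
    e_c = np.linalg.solve(A_c, P.T @ r)
    x = x + P @ e_c
    return jacobi(A_f, b, x)
```

In p-multigrid the "coarse" level is a lower polynomial order on the same mesh rather than a coarser grid, but the correction cycle has this same shape.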
High-order incompressible computational fluid dynamics on modern hardware architectures
In this thesis, a high-order incompressible Navier-Stokes solver is developed in the Python-based PyFR framework. The solver is based on the artificial compressibility formulation with a Flux Reconstruction (FR) discretisation in space and explicit dual time stepping in time. In order to reduce time to solution, explicit convergence acceleration techniques are developed and implemented. These techniques include polynomial multigrid, a novel locally adaptive pseudo-time stepping approach and novel stability-optimised Runge-Kutta schemes.
Choices regarding the numerical methods and implementation are motivated as follows. Firstly, high-order FR is selected as the spatial discretisation due to its low dissipation and ability to work with unstructured meshes of complex geometries. Being discontinuous, it also allows the majority of computation to be performed locally. Secondly, convergence acceleration techniques are restricted to explicit methods in order to retain the spatial locality provided by FR, which allows efficient harnessing of the massively parallel compute capability of modern hardware. Thirdly, the solver is implemented in the PyFR framework with cross-platform support such that it can run on modern heterogeneous systems via an MPI + X model, with X being CUDA, OpenCL or OpenMP. As such, it is well-placed to remain relevant in an era of rapidly evolving hardware architectures.
The new software constitutes the first high-order accurate cross-platform implementation of an incompressible Navier-Stokes solver via artificial compressibility. The solver and the convergence acceleration techniques are validated for a range of turbulent test cases. Furthermore, performance of the convergence acceleration techniques is assessed with a 2D cylinder test case, showing speed-up factors of over 20 relative to global RK4 pseudo-time stepping when all of the technologies are combined. Finally, a simulation of the DARPA SUBOFF submarine model is undertaken using the solver and all convergence acceleration techniques. Excellent agreement with previous studies is obtained, demonstrating that the technology can be used to conduct high-fidelity implicit Large Eddy Simulation of industrially relevant problems at scale using hundreds of GPUs.
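The dual time stepping named in this abstract converges each physical time step by marching a pseudo-time iteration to a steady state. The following is a generic sketch of that structure: the `residual` callback, the BDF2 physical-time discretisation, and the forward-Euler pseudo integrator are illustrative assumptions, not PyFR's actual scheme (which, per the abstract, uses optimised Runge-Kutta smoothers and adaptive pseudo-time steps).

```python
import numpy as np

def dual_time_step(u_n, u_nm1, dt, residual, dtau=0.01,
                   tol=1e-10, max_iter=500):
    """Advance one physical time step via dual time stepping.
    The unsteady residual (spatial residual plus a BDF2 physical-time
    term) is driven to zero by explicit pseudo-time iteration."""
    u = u_n.copy()
    for _ in range(max_iter):
        # unsteady residual: spatial residual + BDF2 time derivative
        r = residual(u) - (3 * u - 4 * u_n + u_nm1) / (2 * dt)
        if np.linalg.norm(r) < tol:
            break
        u = u + dtau * r          # explicit (forward-Euler) pseudo step
    return u
```

Because only explicit pseudo-time updates are used, each iteration touches purely local data, which is the spatial-locality property the thesis exploits on massively parallel hardware.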
HPC-enabling technologies for high-fidelity combustion simulations
With the increase in computational power in the last decade and the forthcoming exascale supercomputers, a new horizon in computational modelling and simulation is envisioned in combustion science. Considering the multiscale and multiphysics characteristics of turbulent reacting flows, combustion simulations are among the most computationally demanding applications running on cutting-edge supercomputers. Exascale computing opens new frontiers for the simulation of combustion systems, as more realistic conditions can be achieved with high-fidelity methods. However, efficient use of these computing architectures requires methodologies that can exploit all levels of parallelism. The efficient utilization of the next generation of supercomputers needs to be considered from a global perspective, that is, involving physical modelling and numerical methods together with methodologies based on High-Performance Computing (HPC) and hardware architectures. This review introduces recent developments in numerical methods for large-eddy simulations (LES) and direct numerical simulations (DNS) of combustion systems, with a focus on computational performance and algorithmic capabilities. Due to the broad scope, a first section is devoted to describing the fundamentals of turbulent combustion, which is followed by a general description of state-of-the-art computational strategies for solving these problems. These applications require advanced HPC approaches to exploit modern supercomputers, which is addressed in the third section. The increasing complexity of new computing architectures, with tightly coupled CPUs and GPUs as well as high levels of parallelism, requires new parallel models and algorithms exposing the required level of concurrency. Advances in dynamic load balancing, vectorization, GPU acceleration and mesh adaptation have made it possible to achieve highly efficient combustion simulations with data-driven methods in HPC environments. Dedicated sections therefore cover the use of high-order methods for reacting flows, the integration of detailed chemistry, and two-phase flows. Final remarks and directions for future work are given at the end.
The research leading to these results has received funding from the European Union's Horizon 2020 Programme under the CoEC project (grant agreement No. 952181) and the CoE RAISE project (grant agreement No. 951733).
A GPU-Accelerated, Hybrid FVM-RANS Methodology for Modeling Rotorcraft Brownout
A numerically efficient, hybrid Eulerian-Lagrangian methodology has been developed to help better understand the complicated two-phase flowfield encountered in rotorcraft brownout environments. The problem of brownout occurs when rotorcraft operate close to surfaces covered with loose particles such as sand, dust or snow. These particles can get entrained, in large quantities, into the rotor wake, leading to a potentially hazardous degradation of the pilot's visibility. It is believed that a computationally efficient model of this phenomenon, validated against available experimental measurements, can be used as a valuable tool to reveal the underlying physics of rotorcraft brownout. The present work involved the design, development and validation of a hybrid solver for the purpose of modeling brownout-like environments. The proposed methodology combines the numerical efficiency of a free-vortex method with the relatively high fidelity of a 3D, time-accurate, Reynolds-averaged Navier-Stokes (RANS) solver. For dual-phase simulations, this hybrid method can be unidirectionally coupled with a sediment tracking algorithm to study cloud development.
In the past, large clusters of CPUs have been the standard approach for large simulations involving the numerical solution of PDEs. In recent years, however, an emerging trend is the use of Graphics Processing Units (GPUs), once used only for graphics rendering, to perform scientific computing. These platforms deliver superior computing power and memory bandwidth compared to traditional CPUs, and their prowess continues to grow rapidly with each passing generation. CFD simulations have been ported successfully onto GPU platforms in the past. However, the nature of GPU architecture has restricted the set of algorithms that exhibit significant speedups on these platforms: GPUs are optimized for operations where a massively large number of threads, relative to the problem size, work in parallel, executing identical instructions on disparate datasets. For this reason, most implementations in the scientific literature involve the use of explicit algorithms for time-stepping, reconstruction, etc. To overcome the difficulty associated with implicit methods, the current work proposes a multi-granular approach to reduce the performance penalties typically encountered with such schemes.
To explore the use of GPUs for RANS simulations, a 3D, time-accurate, implicit, structured, compressible, viscous, turbulent, finite-volume RANS solver was designed and developed in CUDA-C. During the development phase, various strategies for performance optimization were used to make the implementation better suited to the GPU architecture. Validation and verification of the GPU-based solver were performed for both canonical and realistic benchmark problems on a variety of GPU platforms. In these test cases, a performance assessment of the GPU-RANS solver indicated that it was between one and two orders of magnitude faster than equivalent single CPU core computations (as high as 50X for fine-grain computations on the latest platforms). For simulations involving implicit methods, a multi-granular technique was used that sought to exploit the intermediate coarse-grain parallelism inherent in families of line-parallel methods like Alternating Direction Implicit (ADI) schemes, coupled with conservative-variable parallelism. This approach had the dual effect of reducing memory bandwidth usage as well as increasing GPU occupancy, leading to significant performance gains. The multi-granular approach for implicit methods used in this work has demonstrated speedups that are close to 50% of those expected with purely explicit methods.
The validated GPU-RANS solver was then coupled with GPU-based free-vortex and sediment tracking methods to model single and dual-phase, model-scale brownout environments. A qualitative and quantitative validation of the methodology was performed by comparing predictions with available measurements, including flowfield measurements and observations of particle transport mechanisms that have been made with laboratory-scale rotor/jet configurations in ground effect. In particular, dual-phase simulations were able to resolve key transport phenomena in the dispersed phase such as creep, vortex trapping and sediment wave formation. Furthermore, these simulations were demonstrated to be computationally more efficient than equivalent computations on a cluster of traditional CPUs: a model-scale brownout simulation using the hybrid approach on a single GTX Titan now takes 1.25 hours per revolution, compared to 6 hours per revolution on 32 Intel Xeon cores.
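The line-parallelism of ADI schemes mentioned above arises because each implicit sweep reduces to many independent tridiagonal systems, one per grid line, which can be solved concurrently. A batched Thomas-algorithm sketch in NumPy illustrates this; the shapes, names, and vectorized batching are illustrative, not the thesis's CUDA-C implementation.

```python
import numpy as np

def thomas_batched(a, b, c, d):
    """Solve many independent tridiagonal systems at once via the
    Thomas algorithm. Arrays have shape (nlines, n): a is the lower
    diagonal (a[:, 0] unused), b the main diagonal, c the upper
    diagonal (c[:, -1] unused), d the right-hand side. Operating on
    all lines simultaneously mirrors the coarse-grain, line-parallel
    structure an ADI sweep exposes on the GPU."""
    nlines, n = b.shape
    cp = np.zeros_like(b)
    dp = np.zeros_like(b)
    cp[:, 0] = c[:, 0] / b[:, 0]
    dp[:, 0] = d[:, 0] / b[:, 0]
    for i in range(1, n):                 # forward elimination
        denom = b[:, i] - a[:, i] * cp[:, i - 1]
        cp[:, i] = c[:, i] / denom
        dp[:, i] = (d[:, i] - a[:, i] * dp[:, i - 1]) / denom
    x = np.zeros_like(b)
    x[:, -1] = dp[:, -1]
    for i in range(n - 2, -1, -1):        # back substitution
        x[:, i] = dp[:, i] - cp[:, i] * x[:, i + 1]
    return x
```

The recurrence along each line is inherently sequential, which is why the work above adds a further level of parallelism across conservative variables to keep GPU occupancy high.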
High-Performance Computing: Dos and Don’ts
Computational fluid dynamics (CFD) is the main field of computational mechanics that has historically benefited from advances in high-performance computing. High-performance computing involves several techniques to make a simulation efficient and fast, such as distributed memory parallelism, shared memory parallelism, vectorization, and memory access optimizations. As an introduction, we present the anatomy of supercomputers, with special emphasis on HPC aspects relevant to CFD. Then, we develop some of the HPC concepts and numerical techniques applied to the complete CFD simulation framework: from preprocessing (meshing) to postprocessing (visualization), through the simulation itself (assembly and iterative solvers).
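Of the HPC techniques this chapter lists, vectorization is the easiest to show in miniature: replacing an element-by-element loop with a whole-array operation lets the runtime (or compiler) use SIMD hardware. A toy AXPY comparison, purely illustrative of the concept:

```python
import numpy as np

def axpy_loop(a, x, y):
    """Scalar loop: one element per iteration, no vectorization."""
    out = np.empty_like(x)
    for i in range(x.size):
        out[i] = a * x[i] + y[i]
    return out

def axpy_vec(a, x, y):
    """Vectorized form: the whole-array expression maps onto
    SIMD/BLAS-style bulk operations and avoids interpreter overhead."""
    return a * x + y
```

Both return the same result; on large arrays the vectorized form is typically orders of magnitude faster in Python, which is the same locality-and-throughput argument the chapter makes at the level of compiled CFD kernels.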
CFD Vision 2030 Study: A Path to Revolutionary Computational Aerosciences
This report documents the results of a study to address the long-range, strategic planning required by NASA's Revolutionary Computational Aerosciences (RCA) program in the area of computational fluid dynamics (CFD), including future software and hardware requirements for High Performance Computing (HPC). Specifically, the "Vision 2030" CFD study is to provide a knowledge-based forecast of the future computational capabilities required for turbulent, transitional, and reacting flow simulations across a broad Mach number regime, and to lay the foundation for the development of a future framework and/or environment where physics-based, accurate predictions of complex turbulent flows, including flow separation, can be accomplished routinely and efficiently in cooperation with other physics-based simulations to enable multi-physics analysis and design. Specific technical requirements from the aerospace industrial and scientific communities were obtained to determine critical capability gaps, anticipated technical challenges, and impediments to achieving the target CFD capability in 2030. A preliminary development plan and roadmap were created to help focus investments in technology development to help achieve the CFD vision in 2030.
09251 Abstracts Collection -- Scientific Visualization
From June 14 to June 19, 2009, the Dagstuhl Seminar 09251 "Scientific Visualization" was held in Schloss Dagstuhl – Leibniz Center for Informatics. During the seminar, over 50 international participants presented their current research, and ongoing work and open problems were discussed. Abstracts of the presentations given during the seminar as well as abstracts of seminar results and ideas are put together in this paper. The first section describes the seminar topics and goals in general.
- …