
    Advancing the Multi-Solver Paradigm for Overset CFD Toward Heterogeneous Architectures

    A multi-solver, overset, computational fluid dynamics framework is developed for efficient, large-scale simulation of rotorcraft problems. Two primary features distinguish the developed framework from the current state of the art. First, the framework is designed for heterogeneous compute architectures, making use of both traditional codes run on the Central Processing Unit (CPU) and codes run on the Graphics Processing Unit (GPU). Second, a framework-level implementation of the Generalized Minimal Residual (GMRES) linear solver is used to treat all meshes from all solvers as a single linear system. The developed GPU flow solver and framework are validated against conventional implementations, achieving a 5.35× speedup for a single GPU compared to 24 CPU cores. Similarly, the overset linear solver is compared to traditional techniques, demonstrating that the same order of convergence can be achieved in as few as half the iterations. Applications of the developed methods are organized into two chapters. First, the heterogeneous, overset framework is applied to a notional helicopter configuration based on the ROBIN wind tunnel experiments. A tail rotor and hub are added to create a challenging case representative of a realistic, full-rotorcraft simulation. Interactional aerodynamics between the different components is reviewed in detail. The second application chapter focuses on the performance of the overset linear solver for unsteady applications. The GPU solver is used along with an unstructured code to simulate laminar flow over a sphere as well as laminar coaxial rotors designed for a Mars helicopter. In all results, the overset linear solver outperforms the traditional, decoupled approach. Conclusions drawn from both the full-rotorcraft and overset linear solver simulations can have a significant impact on improving the modeling of complex rotorcraft aerodynamics.
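    To make the coupled-solve idea concrete, the following is a minimal sketch, under assumed interfaces, of how a framework-level GMRES can span the unknowns of every overset mesh at once: a matrix-free Krylov solve whose operator applies each solver's Jacobian to its own block of a global vector and adds the overset (fringe) interpolation coupling. The solver objects, their jacobian_matvec method and the interp matrix are hypothetical names, not the dissertation's code.

        # Hypothetical sketch: matrix-free GMRES over one global vector that
        # concatenates the unknowns of all overset solvers. Solver interfaces
        # and the fringe interpolation matrix are assumptions for illustration.
        import numpy as np
        from scipy.sparse.linalg import LinearOperator, gmres

        def coupled_matvec(solvers, interp, x):
            """Apply the global Jacobian: per-solver diagonal blocks plus
            off-diagonal overset (fringe) interpolation coupling."""
            y = np.empty_like(x)
            for s in solvers:  # each solver owns a contiguous slice of x
                xs = x[s.offset:s.offset + s.ndof]
                y[s.offset:s.offset + s.ndof] = s.jacobian_matvec(xs)
            return y + interp @ x  # coupling terms between meshes

        def solve_coupled(solvers, interp, rhs, restart=30):
            n = rhs.size
            A = LinearOperator((n, n),
                               matvec=lambda x: coupled_matvec(solvers, interp, x))
            dx, info = gmres(A, rhs, restart=restart)  # info == 0 on convergence
            return dx, info

    Treating the fringe coupling inside the Krylov operator, rather than lagging it between per-solver solves, is the plausible mechanism behind the reported halving of the iteration count relative to the decoupled approach.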

    Acceleration Techniques for Industrial Large Eddy Simulation with High-Order Methods on CPU-GPU Clusters

    One of the key findings of NASA's CFD Vision 2030 study is that the use of CFD in the aerospace design process is severely limited by the inability to accurately and reliably predict turbulent flows with significant regions of separation. Scale-resolving simulations such as large eddy simulation (LES) are increasingly applied to complex problems such as flow over high-lift configurations and through aircraft engines. The present work has the overall objective of reducing the computational cost of industrial LES. The high-order flux reconstruction (FR) method is used as the spatial discretization scheme. First, two acceleration techniques are investigated: the p-multigrid algorithm and Mach number preconditioning. The Weiss and Smith low Mach number preconditioner is used together with the p-multigrid method, and the third-order explicit Runge-Kutta (RK3) scheme is used as the smoother to reduce memory requirements. Mach number preconditioning significantly increases the efficiency of the p-multigrid method. For unsteady simulations, the preconditioner improves the efficiency of the p-multigrid method at larger physical time steps. In most steady cases, the preconditioned p-multigrid approach is comparable to or faster than the implicit LU-SGS algorithm and requires less memory, especially for p ≥ 2 schemes. An efficient implementation of the FR method is developed for modern GPU clusters, and the speedup is investigated for different polynomial orders and cell types. Approaches to improve the parallel efficiency of multi-GPU simulations are also studied. The simulation node-hour cost on the Summit supercomputer is reduced by a factor of 50 for hexahedral cells and up to 200 for tetrahedral cells. Two low-memory implicit time integration methods are implemented on GPUs: a matrix-free GMRES solver and a novel local GMRES-SGS method. Parametric studies are performed to evaluate their performance on LES benchmark cases. On the High-Lift Common Research Model case from the 2021 4th AIAA High-Lift Prediction Workshop, the two GPU implicit time integration methods provide additional speedups of 14 and 68, respectively, over the GPU explicit time simulation.
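    To fix ideas on the p-multigrid/RK3 combination, here is a minimal sketch of one V-cycle over polynomial orders, written as a linear correction scheme for brevity (a nonlinear FR solver would typically use the FAS variant). The per-level residual, restriction and prolongation operators are assumed interfaces, not the paper's code.

        # Hedged sketch: p-multigrid V-cycle with an explicit SSP-RK3 smoother,
        # marching du/dtau = f - R(u) in pseudo-time at each polynomial level.
        def rk3_smooth(level, u, f, nsteps, dtau):
            """A few explicit 3-stage Runge-Kutta pseudo-time steps (the smoother)."""
            for _ in range(nsteps):
                u1 = u + dtau * (f - level.residual(u))
                u2 = 0.75 * u + 0.25 * (u1 + dtau * (f - level.residual(u1)))
                u = u / 3.0 + 2.0 * (u2 + dtau * (f - level.residual(u2))) / 3.0
            return u

        def p_vcycle(levels, k, u, f, dtau):
            """Recursive V-cycle over polynomial orders p = k, k-1, ..., 0."""
            u = rk3_smooth(levels[k], u, f, 2, dtau)             # pre-smooth
            if k > 0:
                defect = f - levels[k].residual(u)
                ec = p_vcycle(levels, k - 1, levels[k - 1].zeros(),
                              levels[k].restrict(defect), dtau)  # coarse correction
                u = u + levels[k].prolong(ec)
                u = rk3_smooth(levels[k], u, f, 2, dtau)         # post-smooth
            return u

    Low-Mach preconditioning would enter through the pseudo-time term, rescaling the acoustic wave speeds so that the explicit smoother damps error modes effectively at low Mach numbers.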

    High-order incompressible computational fluid dynamics on modern hardware architectures

    In this thesis, a high-order incompressible Navier-Stokes solver is developed in the Python-based PyFR framework. The solver is based on the artificial compressibility formulation with a Flux Reconstruction (FR) discretisation in space and explicit dual time stepping in time. In order to reduce time to solution, explicit convergence acceleration techniques are developed and implemented. These techniques include polynomial multigrid, a novel locally adaptive pseudo-time stepping approach and novel stability-optimised Runge-Kutta schemes. Choices regarding the numerical methods and implementation are motivated as follows. Firstly, high-order FR is selected as the spatial discretisation due to its low dissipation and ability to work with unstructured meshes of complex geometries. Being discontinuous, it also allows the majority of computation to be performed locally. Secondly, convergence acceleration techniques are restricted to explicit methods in order to retain the spatial locality provided by FR, which allows efficient harnessing of the massively parallel compute capability of modern hardware. Thirdly, the solver is implemented in the PyFR framework with cross-platform support such that it can run on modern heterogeneous systems via an MPI + X model, with X being CUDA, OpenCL or OpenMP. As such, it is well-placed to remain relevant in an era of rapidly evolving hardware architectures. The new software constitutes the first high-order accurate cross-platform implementation of an incompressible Navier-Stokes solver via artificial compressibility. The solver and the convergence acceleration techniques are validated for a range of turbulent test cases. Furthermore, the performance of the convergence acceleration techniques is assessed with a 2D cylinder test case, showing speed-up factors of over 20 relative to global RK4 pseudo-time stepping when all of the technologies are combined. Finally, a simulation of the DARPA SUBOFF submarine model is undertaken using the solver and all convergence acceleration techniques. Excellent agreement with previous studies is obtained, demonstrating that the technology can be used to conduct high-fidelity implicit Large Eddy Simulation of industrially relevant problems at scale using hundreds of GPUs.
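    A minimal sketch of the dual time stepping loop described above, assuming a generic spatial residual R(u) and, for brevity, a uniform BDF2 physical-time term (in the actual artificial compressibility formulation the continuity equation carries no physical time derivative). Names and step sizes are illustrative, not PyFR's API.

        # Hedged sketch: each physical step of size dt is converged by explicit
        # pseudo-time iteration, with the BDF2 derivative acting as a source.
        import numpy as np

        def dual_time_step(R, u_n, u_nm1, dt, dtau, tol=1e-6, max_iters=500):
            """Advance one physical time step via explicit pseudo-time marching."""
            u = u_n.copy()
            for _ in range(max_iters):
                source = (3.0 * u - 4.0 * u_n + u_nm1) / (2.0 * dt)  # BDF2 term
                r = R(u) - source                    # unsteady pseudo-residual
                if np.linalg.norm(r) < tol:          # pseudo-steady: step converged
                    break
                u = u + dtau * r                     # forward-Euler pseudo step
            return u

    The acceleration techniques developed in the thesis act on this inner loop: polynomial multigrid and the stability-optimised Runge-Kutta schemes replace the plain forward-Euler pseudo step, while locally adaptive pseudo-time stepping lets dtau vary locally rather than being a single global value.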

    HPC-enabling technologies for high-fidelity combustion simulations

    With the increase in computational power in the last decade and the forthcoming Exascale supercomputers, a new horizon in computational modelling and simulation is envisioned in combustion science. Considering the multiscale and multiphysics characteristics of turbulent reacting flows, combustion simulations are among the most computationally demanding applications running on cutting-edge supercomputers. Exascale computing opens new frontiers for the simulation of combustion systems, as more realistic conditions can be achieved with high-fidelity methods. However, efficient use of these computing architectures requires methodologies that can exploit all levels of parallelism. The efficient utilization of the next generation of supercomputers needs to be considered from a global perspective, that is, involving physical modelling and numerical methods together with methodologies based on High-Performance Computing (HPC) and hardware architectures. This review introduces recent developments in numerical methods for large-eddy simulations (LES) and direct numerical simulations (DNS) of combustion systems, with a focus on computational performance and algorithmic capabilities. Due to the broad scope, a first section is devoted to the fundamentals of turbulent combustion, followed by a general description of state-of-the-art computational strategies for solving these problems. These applications require advanced HPC approaches to exploit modern supercomputers, which is addressed in the third section. The increasing complexity of new computing architectures, with tightly coupled CPUs and GPUs as well as high levels of parallelism, requires new parallel models and algorithms exposing the required level of concurrency. Advances in dynamic load balancing, vectorization, GPU acceleration and mesh adaptation have enabled highly efficient combustion simulations with data-driven methods in HPC environments. Therefore, dedicated sections cover the use of high-order methods for reacting flows, the integration of detailed chemistry, and two-phase flows. Final remarks and directions of future work are given at the end. The research leading to these results has received funding from the European Union's Horizon 2020 Programme under the CoEC project, grant agreement No. 952181, and the CoE RAISE project, grant agreement No. 951733.
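    As a concrete illustration of the detailed-chemistry integration discussed in the review (a generic operator-splitting sketch, not code from any of the surveyed solvers): the reaction source terms are stiff, so they are commonly advanced cell by cell with an implicit ODE integrator between transport steps. The rate function and array shapes below are assumptions.

        # Hedged sketch: operator-split chemistry step. Each cell's species/
        # temperature ODE is integrated independently with a stiff (BDF) solver.
        import numpy as np
        from scipy.integrate import solve_ivp

        def react_substep(Y_cells, T_cells, dt, reaction_rhs):
            """Integrate chemistry over dt in every cell; reaction_rhs(t, y)
            returns d/dt of [species mass fractions..., temperature]."""
            Y_out = np.empty_like(Y_cells)
            for i, (Y, T) in enumerate(zip(Y_cells, T_cells)):
                y0 = np.append(Y, T)
                sol = solve_ivp(reaction_rhs, (0.0, dt), y0,
                                method="BDF", rtol=1e-8, atol=1e-12)
                Y_out[i] = sol.y[:-1, -1]   # updated mass fractions
                                            # (temperature update omitted here)
            return Y_out

    Because each cell's ODE is independent, this step is embarrassingly parallel, but its cost varies sharply across the domain (stiff near the flame front, cheap elsewhere), which is precisely why the dynamic load balancing highlighted above matters.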

    A GPU-ACCELERATED, HYBRID FVM-RANS METHODOLOGY FOR MODELING ROTORCRAFT BROWNOUT

    A numerically efficient, hybrid Eulerian-Lagrangian methodology has been developed to help better understand the complicated two-phase flowfield encountered in rotorcraft brownout environments. The problem of brownout occurs when rotorcraft operate close to surfaces covered with loose particles such as sand, dust or snow. These particles can get entrained, in large quantities, into the rotor wake, leading to a potentially hazardous degradation of the pilot's visibility. It is believed that a computationally efficient model of this phenomenon, validated against available experimental measurements, can be used as a valuable tool to reveal the underlying physics of rotorcraft brownout. The present work involved the design, development and validation of a hybrid solver for the purpose of modeling brownout-like environments. The proposed methodology combines the numerical efficiency of a free-vortex method with the relatively high fidelity of a 3D, time-accurate, Reynolds-averaged Navier-Stokes (RANS) solver. For dual-phase simulations, this hybrid method can be unidirectionally coupled with a sediment tracking algorithm to study cloud development. In the past, large clusters of CPUs have been the standard approach for large simulations involving the numerical solution of PDEs. In recent years, however, an emerging trend is the use of Graphics Processing Units (GPUs), once used only for graphics rendering, to perform scientific computing. These platforms deliver superior computing power and memory bandwidth compared to traditional CPUs, and their prowess continues to grow rapidly with each passing generation. CFD simulations have been ported successfully onto GPU platforms in the past. However, the nature of GPU architecture has restricted the set of algorithms that exhibit significant speedups on these platforms: GPUs are optimized for operations where a massively large number of threads, relative to the problem size, work in parallel, executing identical instructions on disparate datasets. For this reason, most implementations in the scientific literature use explicit algorithms for time-stepping, reconstruction, etc. To overcome the difficulty associated with implicit methods, the current work proposes a multi-granular approach to reduce the performance penalties typically encountered with such schemes. To explore the use of GPUs for RANS simulations, a 3D, time-accurate, implicit, structured, compressible, viscous, turbulent, finite-volume RANS solver was designed and developed in CUDA-C. During the development phase, various strategies for performance optimization were used to make the implementation better suited to the GPU architecture. Validation and verification of the GPU-based solver were performed for both canonical and realistic benchmark problems on a variety of GPU platforms. In these test cases, a performance assessment of the GPU-RANS solver indicated that it was between one and two orders of magnitude faster than equivalent single-CPU-core computations (as high as 50X for fine-grain computations on the latest platforms). For simulations involving implicit methods, a multi-granular technique was used that sought to exploit the intermediate coarse-grain parallelism inherent in families of line-parallel methods, such as Alternating Direction Implicit (ADI) schemes, coupled with conservative-variable parallelism.
    This approach had the dual effect of reducing memory bandwidth usage and increasing GPU occupancy, leading to significant performance gains. The multi-granular approach for implicit methods used in this work has demonstrated speedups that are close to 50% of those expected with purely explicit methods. The validated GPU-RANS solver was then coupled with GPU-based free-vortex and sediment tracking methods to model single- and dual-phase, model-scale brownout environments. A qualitative and quantitative validation of the methodology was performed by comparing predictions with available measurements, including flowfield measurements and observations of particle transport mechanisms made with laboratory-scale rotor/jet configurations in ground effect. In particular, the dual-phase simulations were able to resolve key transport phenomena in the dispersed phase such as creep, vortex trapping and sediment wave formation. Furthermore, these simulations were demonstrated to be computationally more efficient than equivalent computations on a cluster of traditional CPUs: a model-scale brownout simulation using the hybrid approach on a single GTX Titan now takes 1.25 hours per revolution compared to 6 hours per revolution on 32 Intel Xeon cores.
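    The line-parallelism that the multi-granular ADI approach exploits can be sketched as follows (a NumPy illustration of the idea, not the CUDA-C implementation): every grid line contributes an independent tridiagonal system, so work can be assigned per line, and additionally per conserved variable, rather than per cell.

        # Sketch of a batched Thomas algorithm: nlines independent tridiagonal
        # solves, vectorized over the leading axis. a, b, c are the sub-, main
        # and super-diagonals and d the right-hand sides, each (nlines, n).
        import numpy as np

        def batched_thomas(a, b, c, d):
            """Solve nlines independent tridiagonal systems simultaneously."""
            n = b.shape[1]
            cp, dp = np.empty_like(b), np.empty_like(d)
            cp[:, 0] = c[:, 0] / b[:, 0]
            dp[:, 0] = d[:, 0] / b[:, 0]
            for j in range(1, n):                   # forward elimination sweep
                m = b[:, j] - a[:, j] * cp[:, j - 1]
                cp[:, j] = c[:, j] / m
                dp[:, j] = (d[:, j] - a[:, j] * dp[:, j - 1]) / m
            x = np.empty_like(d)
            x[:, -1] = dp[:, -1]
            for j in range(n - 2, -1, -1):          # back substitution sweep
                x[:, j] = dp[:, j] - cp[:, j] * x[:, j + 1]
            return x

    On a GPU, each line (and each conserved variable) would map to its own thread or warp; the batching over the leading axis is the intermediate coarse-grain parallelism referred to above.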

    High-Performance Computing: Dos and Don’ts

    Computational fluid dynamics (CFD) is the main field of computational mechanics that has historically benefited from advances in high-performance computing. High-performance computing involves several techniques to make a simulation efficient and fast, such as distributed-memory parallelism, shared-memory parallelism, vectorization and memory access optimizations. As an introduction, we present the anatomy of supercomputers, with special emphasis on the HPC aspects relevant to CFD. Then, we develop some of the HPC concepts and numerical techniques applied to the complete CFD simulation framework: from preprocessing (meshing) to postprocessing (visualization), through the simulation itself (assembly and iterative solvers).
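    As a toy example of two of the techniques listed above, vectorization and iterative solvers (illustrative only; the chapter itself is solver-agnostic): a Jacobi sweep for a 2D Laplace problem written as whole-array operations, the form that compilers and array libraries map onto vector hardware.

        # Illustrative sketch: vectorized Jacobi iteration for Laplace's
        # equation on a 2D grid; boundary values stay fixed. Grid size and
        # convergence settings are arbitrary choices for the example.
        import numpy as np

        def jacobi_laplace(u, tol=1e-6, max_iters=10_000):
            """Iterate u_ij <- average of its 4 neighbours on the interior."""
            for it in range(max_iters):
                u_new = u.copy()
                u_new[1:-1, 1:-1] = 0.25 * (u[:-2, 1:-1] + u[2:, 1:-1] +
                                            u[1:-1, :-2] + u[1:-1, 2:])
                if np.max(np.abs(u_new - u)) < tol:
                    return u_new, it        # converged
                u = u_new
            return u, max_iters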

    CFD Vision 2030 Study: A Path to Revolutionary Computational Aerosciences

    This report documents the results of a study to address the long-range, strategic planning required by NASA's Revolutionary Computational Aerosciences (RCA) program in the area of computational fluid dynamics (CFD), including future software and hardware requirements for High Performance Computing (HPC). Specifically, the "Vision 2030" CFD study provides a knowledge-based forecast of the future computational capabilities required for turbulent, transitional, and reacting flow simulations across a broad Mach number regime, and lays the foundation for the development of a future framework and/or environment where physics-based, accurate predictions of complex turbulent flows, including flow separation, can be accomplished routinely and efficiently in cooperation with other physics-based simulations to enable multi-physics analysis and design. Specific technical requirements from the aerospace industrial and scientific communities were obtained to determine critical capability gaps, anticipated technical challenges, and impediments to achieving the target CFD capability in 2030. A preliminary development plan and roadmap were created to help focus investments in technology development toward achieving the CFD vision in 2030.

    09251 Abstracts Collection -- Scientific Visualization

    From June 14 to June 19, 2009, the Dagstuhl Seminar 09251 "Scientific Visualization" was held in Schloss Dagstuhl - Leibniz Center for Informatics. During the seminar, over 50 international participants presented their current research, and ongoing work and open problems were discussed. Abstracts of the presentations given during the seminar, as well as abstracts of seminar results and ideas, are put together in this paper. The first section describes the seminar topics and goals in general.