How to improve open rotor aerodynamics at cruise and take-off
A key challenge in open rotor design is achieving optimum aerodynamics at both the cruise and take-off conditions. This is particularly difficult because the operation and the requirements of an open rotor at cruise are very different from those at take-off. This paper uses CFD results to explore the impact of various design changes on the cruise and take-off flow-fields. It then considers how a given open rotor design is best operated at take-off to minimise noise whilst maintaining high thrust. The main findings are that various design modifications can be applied to control the flow features that lead to lost efficiency at cruise and increased noise emission at take-off. A breakdown of the lost power terms from CFD solutions demonstrates how developments in open rotor design have led to reduced aerodynamic losses. At take-off, the operating point of the open rotor should be set such that the non-dimensional lift is as high as possible without causing significant flow separation. This can be achieved through suitable amounts of re-pitch and speed-up applied to a design. Comparisons with fully three-dimensional CFD show that the amount of re-pitch required can be determined using simplified methods such as two-dimensional CFD and a Blade Element Method. This is the accepted manuscript; the final published version is available at http://aerosociety.com/News/Publications/Aero-Journal/Online/2522/How-to-improve-open-rotor-aerodynamics-at-cruise-and-takeoff
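The re-pitch and speed-up strategy lends itself to a Blade Element illustration. The Python below is a minimal sketch, not the paper's method: the aerofoil polar, blade geometry, and all numbers are invented assumptions, and blade twist and induced flow are ignored.

```python
import math

# Hypothetical linear-slope aerofoil polar with a crude stall cap.
# Everything here is illustrative, not taken from the paper.
def lift_coefficient(alpha, cl_slope=2 * math.pi, cl_max=1.4):
    return max(min(cl_slope * alpha, cl_max), -cl_max)

def blade_element_thrust(radii, chord, pitch, rpm, v_axial,
                         rho=1.225, n_blades=8):
    """Estimate rotor thrust by summing sectional lift over radial elements.

    A re-pitch is modelled by offsetting `pitch`; a speed-up by raising
    `rpm`, mirroring the take-off operating strategy described above.
    """
    omega = rpm * 2.0 * math.pi / 60.0
    thrust = 0.0
    for i in range(len(radii) - 1):
        r = 0.5 * (radii[i] + radii[i + 1])      # element mid-radius
        dr = radii[i + 1] - radii[i]
        v_tan = omega * r
        phi = math.atan2(v_axial, v_tan)         # local inflow angle
        alpha = pitch - phi                      # sectional incidence
        w2 = v_axial ** 2 + v_tan ** 2           # relative speed squared
        dL = 0.5 * rho * w2 * chord * lift_coefficient(alpha) * dr
        thrust += n_blades * dL * math.cos(phi)  # axial component only
    return thrust
```

Evaluating the same blade at a higher rpm (or a larger pitch) raises the sectional incidence and hence the thrust, until the stall cap limits the lift, which is the trade-off the abstract describes.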
An MPI-CUDA Implementation for Massively Parallel Incompressible Flow Computations on Multi-GPU Clusters
Modern graphics processing units (GPUs) with many-core architectures have emerged as general-purpose parallel computing platforms that can accelerate simulation science applications tremendously. While multi-GPU workstations with several TeraFLOPS of peak computing power are available to accelerate computational problems, larger problems require even more resources. Conventional clusters of central processing units (CPUs) are now being augmented with multiple GPUs in each compute node to tackle large problems. The heterogeneous architecture of a multi-GPU cluster with a deep memory hierarchy creates unique challenges in developing scalable and efficient simulation codes. In this study, we pursue mixed MPI-CUDA implementations and investigate three strategies to probe the efficiency and scalability of incompressible flow computations on the Lincoln Tesla cluster at the National Center for Supercomputing Applications (NCSA). We exploit some of the advanced features of MPI and CUDA programming to overlap both GPU data transfer and MPI communications with computations on the GPU. We sustain approximately 2.4 TeraFLOPS on the 64 nodes of the NCSA Lincoln Tesla cluster using 128 GPUs with a total of 30,720 processing elements. Our results demonstrate that multi-GPU clusters can substantially accelerate computational fluid dynamics (CFD) simulations.
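The decomposition pattern behind such a dual-level implementation can be mimicked in a few lines. The sketch below is illustrative only: each Python list stands in for one rank's slab of a 1D Laplace problem (with one halo cell per side), and a plain in-memory copy stands in for the MPI halo exchange that the paper overlaps with GPU computation.

```python
def jacobi_step(u):
    """One Jacobi sweep on a local slab; the end cells (physical boundary
    or halo) are left untouched."""
    return ([u[0]]
            + [0.5 * (u[i - 1] + u[i + 1]) for i in range(1, len(u) - 1)]
            + [u[-1]])

def exchange_halos(slabs):
    """Refresh halo cells from neighbouring slabs. In the paper this is an
    MPI message overlapped with GPU compute; here it is a simple copy."""
    for k in range(len(slabs) - 1):
        slabs[k][-1] = slabs[k + 1][1]   # right halo <- neighbour's interior
        slabs[k + 1][0] = slabs[k][-2]   # left halo  <- neighbour's interior
```

In the real solver, the slab boundary cells are computed first so their transfer (device-to-host copy plus MPI send) can proceed in a separate CUDA stream while the slab interior is still being updated on the GPU; the serial sketch only captures the data layout, not that overlap.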
A Full-Depth Amalgamated Parallel 3D Geometric Multigrid Solver for GPU Clusters
Numerical computations of incompressible flow equations with pressure-based algorithms necessitate the solution of an elliptic Poisson equation, for which multigrid methods are known to be very efficient. In our previous work we presented a dual-level (MPI-CUDA) parallel implementation of the Navier-Stokes equations to simulate buoyancy-driven incompressible fluid flows on GPU clusters with simple iterative methods while focusing on the scalability of the overall solver. In the present study we describe the implementation and performance of a multigrid method to solve the pressure Poisson equation within our MPI-CUDA parallel incompressible flow solver. Various design decisions and algorithmic choices for multigrid methods are explored in light of NVIDIA’s recent Fermi architecture. We discuss how unique aspects of an MPI-CUDA implementation for GPU clusters are related to the software choices made to implement the multigrid method. We propose a new coarse grid solution method of embedded multigrid with amalgamation and show that the parallel implementation retains the numerical efficiency of the multigrid method. Performance measurements on the NCSA Lincoln and TACC Longhorn clusters are presented for up to 64 GPUs.
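The V-cycle at the heart of such a solver is easy to sketch in serial form. The Python below is an illustrative 1D analogue of the pressure Poisson solve, not the paper's amalgamated GPU implementation: damped-Jacobi smoothing, full-weighting restriction, linear prolongation, and a direct solve on the coarsest grid.

```python
def smooth(u, f, h, sweeps=2):
    """Damped (2/3) Jacobi smoothing for -u'' = f on a uniform 1D grid."""
    for _ in range(sweeps):
        new = u[:]
        for i in range(1, len(u) - 1):
            jac = 0.5 * (u[i - 1] + u[i + 1] + h * h * f[i])
            new[i] = u[i] + (2.0 / 3.0) * (jac - u[i])
        u = new
    return u

def residual(u, f, h):
    r = [0.0] * len(u)
    for i in range(1, len(u) - 1):
        r[i] = f[i] - (2 * u[i] - u[i - 1] - u[i + 1]) / (h * h)
    return r

def restrict(r):
    """Full weighting onto a grid with half as many intervals."""
    return ([0.0]
            + [0.25 * r[2 * i - 1] + 0.5 * r[2 * i] + 0.25 * r[2 * i + 1]
               for i in range(1, len(r) // 2)]
            + [0.0])

def prolong(e):
    """Linear interpolation back to the fine grid."""
    fine = [0.0] * (2 * len(e) - 1)
    for i in range(len(e)):
        fine[2 * i] = e[i]
    for i in range(1, len(fine) - 1, 2):
        fine[i] = 0.5 * (fine[i - 1] + fine[i + 1])
    return fine

def v_cycle(u, f, h):
    if len(u) == 3:  # coarsest grid: a single unknown, solved exactly
        return [u[0], 0.5 * (u[0] + u[2] + h * h * f[1]), u[2]]
    u = smooth(u, f, h)                          # pre-smooth
    r_coarse = restrict(residual(u, f, h))
    e = v_cycle([0.0] * len(r_coarse), r_coarse, 2 * h)
    e_fine = prolong(e)
    u = [u[i] + e_fine[i] for i in range(len(u))]  # coarse-grid correction
    return smooth(u, f, h)                       # post-smooth
```

The design questions the abstract raises show up even here: on a GPU cluster the coarse grids become too small to keep every rank and every GPU thread busy, which is what motivates amalgamating them onto fewer ranks rather than recursing as naively as this serial sketch does.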
An accelerated 3D Navier-Stokes solver for flows in turbomachines
A new three-dimensional Navier-Stokes solver for flows in turbomachines has been developed. The new solver is based on the latest version of the Denton codes, but has been implemented to run on Graphics Processing Units (GPUs) instead of the traditional Central Processing Unit (CPU). The change in processor enables an order-of-magnitude reduction in run-time due to the higher performance of the GPU. Scaling results for a 16-node GPU cluster are also presented, showing almost linear scaling for typical turbomachinery cases. For validation purposes, a test case consisting of a three-stage turbine with complete hub and casing leakage paths is described. Good agreement is obtained with previously published experimental results. The simulation runs in less than 10 minutes on a cluster with four GPUs. Copyright © 2009 by ASME