619 research outputs found

    High Performance Scientific Computing in Applications with Direct Finite Element Simulation

    Get PDF
    To predict separated flow including stall of a full aircraft with Computational Fluid Dynamics (CFD) is considered one of the problems of the grand challenges to be solved by 2030, according to NASA [1]. The nonlinear Navier- Stokes equations provide the mathematical formulation for fluid flow in 3- dimensional spaces. However, classical solutions, existence, and uniqueness are still missing. Since brute-force computation is intractable, to perform predictive simulation for a full aircraft, one can use Direct Numerical Simulation (DNS); however, it is prohibitively expensive as it needs to resolve the turbulent scales of order Re4 . Considering other methods such as statistical average Reynolds’s Average Navier Stokes (RANS), spatial average Large Eddy Simulation (LES), and hybrid Detached Eddy Simulation (DES), which require less number of degrees of freedom. All of these methods have to be tuned to benchmark problems, and moreover, near the walls, the mesh has to be very fine to resolve boundary layers (which means the computational cost is very expensive). Above all, the results are sensitive to, e.g. explicit parameters in the method, the mesh, etc. As a resolution to the challenge, here we present the adaptive time- resolved Direct FEM Solution (DFS) methodology with numerical tripping, as a predictive, parameter-free family of methods for turbulent flow. We solved the JAXA Standard Model (JSM) aircraft model at realistic Reynolds number, presented as part of the High Lift Prediction Workshop 3. We predicted lift Cl within 5% error vs. experiment, drag Cd within 10% error and stall 1◦ within the angle of attack. The workshop identified a likely experimental error of order 10% for the drag results. The simulation is 10 times faster and cheaper when compared to traditional or existing CFD approaches. The efficiency mainly comes from the slip boundary condition that allows coarse meshes near walls, goal-oriented adaptive error control that refines the mesh only where needed and large time steps using a Schur-type fixed-point iteration method, without compromising the accuracy of the simulation results. As a follow-up, we were invited to the Fifth High Order CFD Workshop, where the approach was validated for a tandem sphere problem (low Reynolds number turbulent flow) wherein a second sphere is placed a certain distance downstream from a first sphere. The results capture the expected slipstream phenomenon, with appx. 2% error. A comparison with the higher-order frameworks Nek500 and PyFR was done. The PyFR framework has demonstrated high effectiveness for GPUs with an unstructured mesh, which is a hard problem in this field. This is achieved by an explicit time-stepping approach. Our study showed that our large time step approach enabled appx. 3 orders of magnitude larger time steps than the explicit time steps in PyFR, which made our method more effective for solving the whole problem. We also presented a generalization of DFS to variable density and validated against the well-established MARIN benchmark problem. The results show good agreement with experimental results in the form of pressure sensors. Later, we used this methodology to solve two applications in multiphase flow problems. One has to do with a flash rainwater storage tank (Bilbao water consortium), and the second is about designing a nozzle for 3D printing. In the flash rainwater storage tank, we predicted that the water height in the tank has a significant influence on how the flow behaves downstream of the tank door (valve). For the 3D printing, we developed an efficient design with the focused jet flow to prevent oxidation and heating at the tip of the nozzle during a melting process. Finally, we presented here the parallelism on multiple GPUs and the embedded system Kalray architecture. Almost all supercomputers today have heterogeneous architectures, such as CPU+GPU or other accelerators, and it is, therefore, essential to develop computational frameworks to take advantage of them. For multiple GPUs, we developed a stencil computation, applied to geological folds simulation. We explored halo computation and used CUDA streams to optimize computation and communication time. The resulting performance gain was 23% for four GPUs with Fermi architecture, and the corresponding improvement obtained on four Kepler GPUs were 47%. The Kalray architecture is designed to have low energy consumption. Here we tested the Jacobi method with different communication strategies. Additionally, visualization is a crucial area when we do scientific simulations. We developed an automated visualization framework, where we could see that task parallelization is more than 10 times faster than data parallelization. We have also used our DFS in the cloud computing setting to validate the simulation against the local cluster simulation. Finally, we recommend the easy pre-processing tool to support DFS simulation.La Caixa 201

    Establishing Best Practices for X-57 Maxwell CFD Database Generation

    Get PDF
    The X-57 Maxwell is NASAs latest electric airplane concept that has been simulated for aerodynamic performance using the structured overset and unstructured grid solvers within the Launch Ascent and Vehicle Aerodynamics (LAVA) solver framework as well as the unstructured polyhedral grid solver in Star-CCM+ for code-to-code comparison. In order to validate the predictions, comparisons were made between the CFD solutions and experimental data collected in the 12-foot Low-Speed Wind Tunnel at NASA Langley Research Center. The simulations are in preparation for the development of a comprehensive aerodynamic database which will assess aircraft performance at a variety of conditions. The findings from these simulations will establish the best practices for mesh resolution, numerical discretization, and turbulence modeling to be used for this database. Preliminary database results have shown that best-practices learned from the initial validation simulations will potentially reduce error in X-57 aerodynamic loads and moments relative to experiment by up to 14%

    Production Level CFD Code Acceleration for Hybrid Many-Core Architectures

    Get PDF
    In this work, a novel graphics processing unit (GPU) distributed sharing model for hybrid many-core architectures is introduced and employed in the acceleration of a production-level computational fluid dynamics (CFD) code. The latest generation graphics hardware allows multiple processor cores to simultaneously share a single GPU through concurrent kernel execution. This feature has allowed the NASA FUN3D code to be accelerated in parallel with up to four processor cores sharing a single GPU. For codes to scale and fully use resources on these and the next generation machines, codes will need to employ some type of GPU sharing model, as presented in this work. Findings include the effects of GPU sharing on overall performance. A discussion of the inherent challenges that parallel unstructured CFD codes face in accelerator-based computing environments is included, with considerations for future generation architectures. This work was completed by the author in August 2010, and reflects the analysis and results of the time

    CFD Analysis and Design Optimization Using Parallel Computers

    Get PDF
    A versatile and efficient multi-block method is presented for the simulation of both steady and unsteady flow, as well as aerodynamic design optimization of complete aircraft configurations. The compressible Euler and Reynolds Averaged Navier-Stokes (RANS) equations are discretized using a high resolution scheme on body-fitted structured meshes. An efficient multigrid implicit scheme is implemented for time-accurate flow calculations. Optimum aerodynamic shape design is achieved at very low cost using an adjoint formulation. The method is implemented on parallel computing systems using the MPI message passing interface standard to ensure portability. The results demonstrate that, by combining highly efficient algorithms with parallel computing, it is possible to perform detailed steady and unsteady analysis as well as automatic design for complex configurations using the present generation of parallel computers
    • …
    corecore