1,335 research outputs found

    Load Balancing Unstructured Adaptive Grids for CFD Problems

    Get PDF
    Mesh adaption is a powerful tool for efficient unstructured-grid computations but causes load imbalance among processors on a parallel machine. A dynamic load balancing method is presented that balances the workload across all processors with a global view. After each parallel tetrahedral mesh adaption, the method first determines if the new mesh is sufficiently unbalanced to warrant a repartitioning. If so, the adapted mesh is repartitioned, with new partitions assigned to processors so that the redistribution cost is minimized. The new partitions are accepted only if the remapping cost is compensated by the improved load balance. Results indicate that this strategy is effective for large-scale scientific computations on distributed-memory multiprocessors

    Parallel Anisotropic Unstructured Grid Adaptation

    Get PDF
    Computational Fluid Dynamics (CFD) has become critical to the design and analysis of aerospace vehicles. Parallel grid adaptation that resolves multiple scales with anisotropy is identified as one of the challenges in the CFD Vision 2030 Study to increase the capacity and capability of CFD simulation. The Study also cautions that computer architectures are undergoing a radical change and dramatic increases in algorithm concurrency will be required to exploit full performance. This paper reviews four different methods to parallel anisotropic grid generation. They cover both ends of the spectrum: (i) using existing state-of-the-art software optimized for a single core and modifying it for parallel platforms and (ii) designing and implementing scalable software with incomplete, but rapidly maturating functionality. A brief overview for each grid adaptation system is presented in the context of a telescopic approach for multilevel concurrency. These methods employ different approaches to enable parallel execution, which provides a unique opportunity to illustrate the relative behavior of each approach. Qualitative and quantitative metric evaluations are used to draw lessons for future developments in this critical area for parallel CFD simulation

    Achieving High Speed CFD simulations: Optimization, Parallelization, and FPGA Acceleration for the unstructured DLR TAU Code

    Get PDF
    Today, large scale parallel simulations are fundamental tools to handle complex problems. The number of processors in current computation platforms has been recently increased and therefore it is necessary to optimize the application performance and to enhance the scalability of massively-parallel systems. In addition, new heterogeneous architectures, combining conventional processors with specific hardware, like FPGAs, to accelerate the most time consuming functions are considered as a strong alternative to boost the performance. In this paper, the performance of the DLR TAU code is analyzed and optimized. The improvement of the code efficiency is addressed through three key activities: Optimization, parallelization and hardware acceleration. At first, a profiling analysis of the most time-consuming processes of the Reynolds Averaged Navier Stokes flow solver on a three-dimensional unstructured mesh is performed. Then, a study of the code scalability with new partitioning algorithms are tested to show the most suitable partitioning algorithms for the selected applications. Finally, a feasibility study on the application of FPGAs and GPUs for the hardware acceleration of CFD simulations is presented

    Parallel load balancing strategy for Volume-of-Fluid methods on 3-D unstructured meshes

    Get PDF
    © 2016. This version is made available under the CC-BY-NC-ND 4.0 license http://creativecommons.org/licenses/by-nc-nd/4.0/l Volume-of-Fluid (VOF) is one of the methods of choice to reproduce the interface motion in the simulation of multi-fluid flows. One of its main strengths is its accuracy in capturing sharp interface geometries, although requiring for it a number of geometric calculations. Under these circumstances, achieving parallel performance on current supercomputers is a must. The main obstacle for the parallelization is that the computing costs are concentrated only in the discrete elements that lie on the interface between fluids. Consequently, if the interface is not homogeneously distributed throughout the domain, standard domain decomposition (DD) strategies lead to imbalanced workload distributions. In this paper, we present a new parallelization strategy for general unstructured VOF solvers, based on a dynamic load balancing process complementary to the underlying DD. Its parallel efficiency has been analyzed and compared to the DD one using up to 1024 CPU-cores on an Intel SandyBridge based supercomputer. The results obtained on the solution of several artificially generated test cases show a speedup of up to similar to 12x with respect to the standard DD, depending on the interface size, the initial distribution and the number of parallel processes engaged. Moreover, the new parallelization strategy presented is of general purpose, therefore, it could be used to parallelize any VOF solver without requiring changes on the coupled flow solver. Finally, note that although designed for the VOF method, our approach could be easily adapted to other interface-capturing methods, such as the Level-Set, which may present similar workload imbalances. (C) 2014 Elsevier Inc. Allrights reserved.Peer ReviewedPostprint (author's final draft

    RIACS

    Get PDF
    Topics considered include: high-performance computing; cognitive and perceptual prostheses (computational aids designed to leverage human abilities); autonomous systems. Also included: development of a 3D unstructured grid code based on a finite volume formulation and applied to the Navier-stokes equations; Cartesian grid methods for complex geometry; multigrid methods for solving elliptic problems on unstructured grids; algebraic non-overlapping domain decomposition methods for compressible fluid flow problems on unstructured meshes; numerical methods for the compressible navier-stokes equations with application to aerodynamic flows; research in aerodynamic shape optimization; S-HARP: a parallel dynamic spectral partitioner; numerical schemes for the Hamilton-Jacobi and level set equations on triangulated domains; application of high-order shock capturing schemes to direct simulation of turbulence; multicast technology; network testbeds; supercomputer consolidation project

    Impact of Load Balancing on Unstructured Adaptive Grid Computations for Distributed-Memory Multiprocessors

    Get PDF
    The computational requirements for an adaptive solution of unsteady problems change as the simulation progresses. This causes workload imbalance among processors on a parallel machine which, in turn, requires significant data movement at runtime. We present a new dynamic load-balancing framework, called JOVE, that balances the workload across all processors with a global view. Whenever the computational mesh is adapted, JOVE is activated to eliminate the load imbalance. JOVE has been implemented on an IBM SP2 distributed-memory machine in MPI for portability. Experimental results for two model meshes demonstrate that mesh adaption with load balancing gives more than a sixfold improvement over one without load balancing. We also show that JOVE gives a 24-fold speedup on 64 processors compared to sequential execution

    Computational Aerodynamics on unstructed meshes

    Get PDF
    New 2D and 3D unstructured-grid based flow solvers have been developed for simulating steady compressible flows for aerodynamic applications. The codes employ the full compressible Euler/Navier-Stokes equations. The Spalart-Al Imaras one equation turbulence model is used to model turbulence effects of flows. The spatial discretisation has been obtained using a cell-centred finite volume scheme on unstructured-grids, consisting of triangles in 2D and of tetrahedral and prismatic elements in 3D. The temporal discretisation has been obtained with an explicit multistage Runge-Kutta scheme. An "inflation" mesh generation technique is introduced to effectively reduce the difficulty in generating highly stretched 2D/3D viscous grids in regions near solid surfaces. The explicit flow method is accelerated by the use of a multigrid method with consideration of the high grid aspect ratio in viscous flow simulations. A solution mesh adaptation technique is incorporated to improve the overall accuracy of the 2D inviscid and viscous flow solutions. The 3D flow solvers are parallelised in a MIMD fashion aimed at a PC cluster system to reduce the computing time for aerodynamic applications. The numerical methods are first applied to several 2D inviscid flow cases, including subsonic flow in a bump channel, transonic flow around a NACA0012 airfoil and transonic flow around the RAE 2822 airfoil to validate the numerical algorithms. The rest of the 2D case studies concentrate on viscous flow simulations including laminar/turbulent flow over a flat plate, transonic turbulent flow over the RAE 2822 airfoil, and low speed turbulent flows in a turbine cascade with massive separations. The results are compared to experimental data to assess the accuracy of the method. The over resolved problem with mesh adaptation on viscous flow simulations is addressed with a two phase mesh reconstruction procedure. The solution convergence rate with the aspect ratio adaptive multigrid method and the direct connectivity based multigrid is assessed in several viscous turbulent flow simulations. Several 3D test cases are presented to validate the numerical algorithms for solving Euler/Navier-Stokes equations. Inviscid flow around the M6 wing airfoil is simulated on the tetrahedron based 3D flow solver with an upwind scheme and spatial second order finite volume method. The efficiency of the multigrid for inviscid flow simulations is examined. The efficiency of the parallelised 3D flow solver and the PC cluster system is assessed with simulations of the same case with different partitioning schemes. The present parallelised 3D flow solvers on the PC cluster system show satisfactory parallel computing performance. Turbulent flows over a flat plate are simulated with the tetrahedron based and prismatic based flow solver to validate the viscous term treatment. Next, simulation of turbulent flow over the M6 wing is carried out with the parallelised 3D flow solvers to demonstrate the overall accuracy of the algorithms and the efficiency of the multigrid method. The results show very good agreement with experimental data. A highly stretched and well-formed computational grid near the solid wall and wake regions is generated with the "inflation" method. The aspect ratio adaptive multigrid displayed a good acceleration rate. Finally, low speed flow around the NREL Phase 11 Wind turbine is simulated and the results are compared to the experimental data

    Parallel Aerodynamic Simulation on Open Workstation Clusters. Department of Aerospace Engineering Report no. 9830

    Get PDF
    The parallel execution of an aerodynamic simulation code on a non-dedicated, heterogeneous cluster of workstations is examined. This type of facility is commonly available to CFD developers and users in academia, industry and government laboratories and is attractive in terms of cost for CFD simulations. However, practical considerations appear at present to be discouraging widespread adoption of this technology. The main obstacles to achieving an efficient, robust parallel CFD capability in a demanding multi-user environment are investigated. A static load-balancing method, which takes account of varying processor speeds, is described. A dynamic re-allocation method to account for varying processor loads has been developed. Use of proprietary management software has facilitated the implementation of the method

    Unstructured Grid Generation Techniques and Software

    Get PDF
    The Workshop on Unstructured Grid Generation Techniques and Software was conducted for NASA to assess its unstructured grid activities, improve the coordination among NASA centers, and promote technology transfer to industry. The proceedings represent contributions from Ames, Langley, and Lewis Research Centers, and the Johnson and Marshall Space Flight Centers. This report is a compilation of the presentations made at the workshop
    corecore