Coarsening Strategies for Unstructured Multigrid Techniques with Application to Anisotropic Problems
Over the years, multigrid has been demonstrated as an efficient technique for solving inviscid flow problems. However, for viscous flows, convergence rates often degrade. This is generally due to the required use of stretched meshes (i.e., the aspect ratio AR = Δy/Δx << 1) in order to capture the boundary layer near the body. Usual techniques for generating a sequence of grids that produce proper convergence rates on isotropic meshes are not adequate for stretched meshes. This work focuses on the solution of Laplace's equation, discretized through a Galerkin finite-element formulation on unstructured stretched triangular meshes. A coarsening strategy is proposed and results are discussed
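The degradation on stretched meshes can be illustrated with a small local Fourier estimate. The sketch below is not the paper's method: it assumes a structured grid, a 5-point finite-difference Laplacian, damped Jacobi smoothing, and full coarsening, all swapped in here for the unstructured Galerkin setting. The function name `jacobi_smoothing_factor` and the damping value `omega=0.8` are illustrative choices.

```python
import numpy as np


def jacobi_smoothing_factor(aspect_ratio, omega=0.8, n=129):
    """Estimate the damped-Jacobi smoothing factor on a stretched grid.

    Assumes the 5-point Laplacian with spacings dx, dy and
    aspect_ratio = dy / dx, combined with full coarsening.
    """
    # Ratio of x-coupling to y-coupling in the stencil: (dy/dx)^2.
    eps = aspect_ratio ** 2
    theta = np.linspace(-np.pi, np.pi, n)
    t1, t2 = np.meshgrid(theta, theta, indexing="ij")
    # Fourier symbol (amplification factor) of damped Jacobi.
    symbol = 1.0 - omega * (
        1.0 - (eps * np.cos(t1) + np.cos(t2)) / (1.0 + eps)
    )
    # High-frequency modes: those not representable on the coarse grid.
    high = np.maximum(np.abs(t1), np.abs(t2)) >= np.pi / 2 - 1e-12
    return float(np.abs(symbol[high]).max())


if __name__ == "__main__":
    for ar in (1.0, 0.1, 0.01):
        mu = jacobi_smoothing_factor(ar)
        print(f"AR = {ar:5.2f}  smoothing factor = {mu:.4f}")
```

Under these assumptions the factor is 0.6 for AR = 1 and approaches 1 as AR shrinks, meaning the smoother barely damps modes that are oscillatory only in the weakly coupled direction. This is the usual motivation for directional (semi-)coarsening or line smoothing on anisotropic problems.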
Semiannual final report, 1 October 1991 - 31 March 1992
A summary of research conducted at the Institute for Computer Applications in Science and Engineering in applied mathematics, numerical analysis, and computer science during the period 1 Oct. 1991 through 31 Mar. 1992 is presented
Toward a GPU-Accelerated Immersed Boundary Method for Wind Forecasting Over Complex Terrain
A short-term wind power forecasting capability can be a valuable tool in the renewable energy industry to address load-balancing issues that arise from intermittent wind fields. Although numerical weather prediction models have been used to forecast winds, their applicability to micro-scale atmospheric boundary layer flows and their ability to predict wind speeds at turbine hub height with a desired accuracy are not clear. To address this issue, we develop a multi-GPU parallel flow solver to forecast winds over complex terrain at the micro-scale, where computational domain size can range from meters to several kilometers. In the solver, we adopt the immersed boundary method and the Lagrangian dynamic large-eddy simulation model and extend them to atmospheric flows. The computations are accelerated on GPU clusters with a dual-level parallel implementation that interleaves MPI with CUDA. We evaluate the flow solver components against test problems and obtain preliminary results of flow over Bolund Hill, a coastal hill in Denmark
Adaptive control in rollforward recovery for extreme scale multigrid
With the increasing number of compute components, failures in future
exa-scale computer systems are expected to become more frequent. This motivates
the study of novel resilience techniques. Here, we extend a recently proposed
algorithm-based recovery method for multigrid iterations by introducing an
adaptive control. After a fault, the healthy part of the system continues the
iterative solution process, while the solution in the faulty domain is
re-constructed by an asynchronous on-line recovery. The computations in both
the faulty and healthy subdomains must be coordinated in a sensitive way; in
particular, both under- and over-solving must be avoided. Both of these waste
computational resources and will therefore increase the overall
time-to-solution. To control the local recovery and guarantee an optimal
re-coupling, we introduce a stopping criterion based on a mathematical error
estimator. It involves hierarchical weighted sums of residuals within the
context of uniformly refined meshes and is well-suited in the context of
parallel high-performance computing. The re-coupling process is steered by
local contributions of the error estimator. We propose and compare two criteria
which differ in their weights. Failure scenarios when solving up to
unknowns on more than 245 766 parallel processes will be
reported on a state-of-the-art peta-scale supercomputer demonstrating the
robustness of the method
Doctor of Philosophy
Partial differential equations (PDEs) are widely used in science and engineering to model phenomena such as sound, heat, and electrostatics. In many practical science and engineering applications, the solutions of PDEs require the tessellation of computational domains into unstructured meshes and entail computationally expensive and time-consuming processes. Therefore, efficient and fast PDE solving techniques on unstructured meshes are important in these applications. Relative to CPUs, the faster growth curves in the speed and greater power efficiency of the SIMD streaming processors, such as GPUs, have gained them an increasingly important role in the high-performance computing area. Combining suitable parallel algorithms and these streaming processors, we can develop very efficient numerical solvers of PDEs. The contributions of this dissertation are twofold: proposal of two general strategies to design efficient PDE solvers on GPUs and the specific applications of these strategies to solve different types of PDEs. Specifically, this dissertation consists of four parts. First, we describe the general strategies, the domain decomposition strategy and the hybrid gathering strategy. Next, we introduce a parallel algorithm for solving the eikonal equation on fully unstructured meshes efficiently. Third, we present the algorithms and data structures necessary to move the entire FEM pipeline to the GPU. Fourth, we propose a parallel algorithm for solving the levelset equation on fully unstructured 2D or 3D meshes or manifolds. This algorithm combines a narrowband scheme with domain decomposition for efficient levelset equation solving
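For readers unfamiliar with the eikonal equation mentioned above, the following is a minimal serial sketch, assuming unit speed (|grad u| = 1), a uniform Cartesian grid, and the classical fast sweeping iteration with a first-order upwind update. It is not the dissertation's unstructured-mesh GPU algorithm; the name `fast_sweep_eikonal` and its parameters are hypothetical.

```python
import numpy as np


def fast_sweep_eikonal(n, h, source, max_iters=50, tol=1e-12):
    """Solve |grad u| = 1 on an n-by-n grid of spacing h, u = 0 at source."""
    u = np.full((n, n), np.inf)
    u[source] = 0.0
    fwd, bwd = range(n), range(n - 1, -1, -1)
    # Four alternating sweep directions cover all characteristic directions.
    orders = [(fwd, fwd), (bwd, fwd), (fwd, bwd), (bwd, bwd)]
    for _ in range(max_iters):
        old = u.copy()
        for rows, cols in orders:
            for i in rows:
                for j in cols:
                    if (i, j) == source:
                        continue
                    # Smallest neighbor value in each grid direction.
                    a = min(u[i - 1, j] if i > 0 else np.inf,
                            u[i + 1, j] if i < n - 1 else np.inf)
                    b = min(u[i, j - 1] if j > 0 else np.inf,
                            u[i, j + 1] if j < n - 1 else np.inf)
                    if np.isinf(a) and np.isinf(b):
                        continue  # no upwind information yet
                    if abs(a - b) >= h:
                        new = min(a, b) + h  # one-sided update
                    else:
                        # Two-sided update: root of the upwind quadratic.
                        new = 0.5 * (a + b + np.sqrt(2 * h * h - (a - b) ** 2))
                    u[i, j] = min(u[i, j], new)
        # Stop once a full round of sweeps no longer changes the solution.
        if np.all(np.isfinite(old)) and np.max(np.abs(u - old)) < tol:
            break
    return u


if __name__ == "__main__":
    u = fast_sweep_eikonal(n=21, h=0.05, source=(0, 0))
    print("arrival time at far corner:", u[-1, -1])
```

With a point source, `u` approximates the distance to the source. The nested loops are inherently sequential within a sweep, which is one reason parallel variants need a different update ordering or a domain decomposition.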
Parallel software tool for decomposing and meshing of 3d structures
An algorithm for automatic parallel generation of three-dimensional unstructured computational meshes based on geometrical domain decomposition is proposed in this paper. A software package built upon the proposed algorithm is described. Several practical examples of mesh generation on multiprocessor computational systems are given. It is shown that the developed parallel algorithm reduces mesh generation time significantly (by dozens of times). Moreover, it easily produces meshes with on the order of 5 · 10^7 elements, whose construction on a single CPU is problematic. Questions of time consumption, efficiency of computations and quality of generated meshes are also considered
Performance Portable Solid Mechanics via Matrix-Free p-Multigrid
Finite element analysis of solid mechanics is a foundational tool of modern
engineering, with low-order finite element methods and assembled sparse
matrices representing the industry standard for implicit analysis. We use
performance models and numerical experiments to demonstrate that high-order
methods greatly reduce the costs to reach engineering tolerances while enabling
effective use of GPUs. We demonstrate the reliability, efficiency, and
scalability of matrix-free p-multigrid methods with algebraic multigrid
coarse solvers through large deformation hyperelastic simulations of multiscale
structures. We investigate accuracy, cost, and execution time on multi-node CPU
and GPU systems for moderate to large models using AMD MI250X (OLCF Crusher),
NVIDIA A100 (NERSC Perlmutter), and V100 (LLNL Lassen and OLCF Summit),
resulting in order of magnitude efficiency improvements over a broad range of
model properties and scales. We discuss efficient matrix-free representation of
Jacobians and demonstrate how automatic differentiation enables rapid
development of nonlinear material models without impacting debuggability and
workflows targeting GPUs
Accuracy, Scalability, and Efficiency of Mixed-Element USM3D for Benchmark Three-Dimensional Flows
The unstructured, mixed-element, cell-centered, finite-volume flow solver USM3D is enhanced with new capabilities including parallelization, line generation for general unstructured grids, improved discretization scheme, and optimized iterative solver. The paper reports on the new developments to the flow solver and assesses the accuracy, scalability, and efficiency. The USM3D assessments are conducted using a baseline method and the recent hierarchical adaptive nonlinear iteration method framework. Two benchmark turbulent flows, namely, a subsonic separated flow around a three-dimensional hemisphere-cylinder configuration and a transonic flow around the ONERA M6 wing are considered
Semiannual report, 1 October 1990 - 31 March 1991
Research conducted at the Institute for Computer Applications in Science and Engineering in applied mathematics, numerical analysis, and computer science is summarized