299 research outputs found

Coarsening Strategies for Unstructured Multigrid Techniques with Application to Anisotropic Problems

Author: Mavriplis D. J.
Morano E.
Venkatakrishnan V.
Publication venue: 'Society for Industrial & Applied Mathematics (SIAM)'
Publication date: 10/09/1998

Over the years, multigrid has been demonstrated as an efficient technique for solving inviscid flow problems. However, for viscous flows, convergence rates often degrade. This is generally due to the required use of stretched meshes (i.e., the aspect ratio AR = Δy/Δx ≪ 1) in order to capture the boundary layer near the body. Usual techniques for generating a sequence of grids that produce proper convergence rates on isotropic meshes are not adequate for stretched meshes. This work focuses on the solution of Laplace's equation, discretized through a Galerkin finite-element formulation on unstructured stretched triangular meshes. A coarsening strategy is proposed and results are discussed.

Caltech Authors

Semiannual final report, 1 October 1991 - 31 March 1992

A summary of research conducted at the Institute for Computer Applications in Science and Engineering in applied mathematics, numerical analysis, and computer science during the period 1 Oct. 1991 through 31 Mar. 1992 is presented

NASA Technical Reports Server

Toward a GPU-Accelerated Immersed Boundary Method for Wind Forecasting Over Complex Terrain

Author: DeLeon Rey
Felzien Kyle
Senocak Inanc
Publication venue: 'IUScholarWorks'
Publication date: 08/07/2012

A short-term wind power forecasting capability can be a valuable tool in the renewable energy industry to address load-balancing issues that arise from intermittent wind fields. Although numerical weather prediction models have been used to forecast winds, their applicability to micro-scale atmospheric boundary layer flows and ability to predict wind speeds at turbine hub height with a desired accuracy is not clear. To address this issue, we develop a multi-GPU parallel flow solver to forecast winds over complex terrain at the micro-scale, where computational domain size can range from meters to several kilometers. In the solver, we adopt the immersed boundary method and the Lagrangian dynamic large-eddy simulation model and extend them to atmospheric flows. The computations are accelerated on GPU clusters with a dual-level parallel implementation that interleaves MPI with CUDA. We evaluate the flow solver components against test problems and obtain preliminary results of flow over Bolund Hill, a coastal hill in Denmark

Crossref

Boise State University - ScholarWorks

Adaptive control in rollforward recovery for extreme scale multigrid

Author: Huber Markus
Rüde Ulrich
Wohlmuth Barbara
Publication date: 01/01/2018

With the increasing number of compute components, failures in future exa-scale computer systems are expected to become more frequent. This motivates the study of novel resilience techniques. Here, we extend a recently proposed algorithm-based recovery method for multigrid iterations by introducing an adaptive control. After a fault, the healthy part of the system continues the iterative solution process, while the solution in the faulty domain is re-constructed by an asynchronous on-line recovery. The computations in both the faulty and healthy subdomains must be coordinated in a sensitive way, in particular, both under and over-solving must be avoided. Both of these waste computational resources and will therefore increase the overall time-to-solution. To control the local recovery and guarantee an optimal re-coupling, we introduce a stopping criterion based on a mathematical error estimator. It involves hierarchical weighted sums of residuals within the context of uniformly refined meshes and is well-suited in the context of parallel high-performance computing. The re-coupling process is steered by local contributions of the error estimator. We propose and compare two criteria which differ in their weights. Failure scenarios when solving up to $6.9\cdot10^{11}$ unknowns on more than $245\,766$ parallel processes will be reported on a state-of-the-art peta-scale supercomputer demonstrating the robustness of the method.

arXiv.org e-Print Archive

Juelich Shared Electronic Resources

Doctor of Philosophy

Author: Fu Zhisong
Publication venue: University of Utah
Publication date: 01/12/2013

Partial differential equations (PDEs) are widely used in science and engineering to model phenomena such as sound, heat, and electrostatics. In many practical science and engineering applications, the solutions of PDEs require the tessellation of computational domains into unstructured meshes and entail computationally expensive and time-consuming processes. Therefore, efficient and fast PDE solving techniques on unstructured meshes are important in these applications. Relative to CPUs, the faster growth curves in the speed and greater power efficiency of the SIMD streaming processors, such as GPUs, have gained them an increasingly important role in the high-performance computing area. Combining suitable parallel algorithms and these streaming processors, we can develop very efficient numerical solvers of PDEs. The contributions of this dissertation are twofold: proposal of two general strategies to design efficient PDE solvers on GPUs and the specific applications of these strategies to solve different types of PDEs. Specifically, this dissertation consists of four parts. First, we describe the general strategies, the domain decomposition strategy and the hybrid gathering strategy. Next, we introduce a parallel algorithm for solving the eikonal equation on fully unstructured meshes efficiently. Third, we present the algorithms and data structures necessary to move the entire FEM pipeline to the GPU. Fourth, we propose a parallel algorithm for solving the levelset equation on fully unstructured 2D or 3D meshes or manifolds. This algorithm combines a narrowband scheme with domain decomposition for efficient levelset equation solving.

The University of Utah: J. Willard Marriott Digital Library

Parallel software tool for decomposing and meshing of 3d structures

Author: Andrä H.
Gluchshenko O.
Ivanov E.
Kudryavtsev A.
Publication date: 01/01/2007

An algorithm for automatic parallel generation of three-dimensional unstructured computational meshes based on geometrical domain decomposition is proposed in this paper. A software package built upon the proposed algorithm is described. Several practical examples of mesh generation on multiprocessor computational systems are given. It is shown that the developed parallel algorithm reduces mesh generation time significantly (by a factor of dozens). Moreover, it easily produces meshes with a number of elements of order $5 \cdot 10^7$, whose construction on a single CPU is problematic. Questions of time consumption, efficiency of computations, and quality of the generated meshes are also considered.

Kaiserslauterer uniweiter elektronischer Dokumentenserver

Performance Portable Solid Mechanics via Matrix-Free $p$-Multigrid

Author: Barra Valeria
Beams Natalie
Brown Jed
Ghaffari Leila
Knepley Matthew
Moses William
Shakeri Rezgar
Stengel Karen
Thompson Jeremy L.
Zhang Junchao
Publication date: 04/04/2022

Finite element analysis of solid mechanics is a foundational tool of modern engineering, with low-order finite element methods and assembled sparse matrices representing the industry standard for implicit analysis. We use performance models and numerical experiments to demonstrate that high-order methods greatly reduce the costs to reach engineering tolerances while enabling effective use of GPUs. We demonstrate the reliability, efficiency, and scalability of matrix-free $p$-multigrid methods with algebraic multigrid coarse solvers through large deformation hyperelastic simulations of multiscale structures. We investigate accuracy, cost, and execution time on multi-node CPU and GPU systems for moderate to large models using AMD MI250X (OLCF Crusher), NVIDIA A100 (NERSC Perlmutter), and V100 (LLNL Lassen and OLCF Summit), resulting in order of magnitude efficiency improvements over a broad range of model properties and scales. We discuss efficient matrix-free representation of Jacobians and demonstrate how automatic differentiation enables rapid development of nonlinear material models without impacting debuggability and workflows targeting GPUs.

arXiv.org e-Print Archive

Accuracy, Scalability, and Efficiency of Mixed-Element USM3D for Benchmark Three-Dimensional Flows

Author: Diskin Boris
Frink Neal T.
Jespersen Dennis C.
Pandya Mohagna J.
Thomas James L.

The unstructured, mixed-element, cell-centered, finite-volume flow solver USM3D is enhanced with new capabilities including parallelization, line generation for general unstructured grids, improved discretization scheme, and optimized iterative solver. The paper reports on the new developments to the flow solver and assesses the accuracy, scalability, and efficiency. The USM3D assessments are conducted using a baseline method and the recent hierarchical adaptive nonlinear iteration method framework. Two benchmark turbulent flows, namely, a subsonic separated flow around a three-dimensional hemisphere-cylinder configuration and a transonic flow around the ONERA M6 wing are considered

NASA Technical Reports Server

Semiannual report, 1 October 1990 - 31 March 1991

Research conducted at the Institute for Computer Applications in Science and Engineering in applied mathematics, numerical analysis, and computer science is summarized

NASA Technical Reports Server