A fully algebraic and robust two-level Schwarz method based on optimal local approximation spaces
Two-level domain decomposition preconditioners lead to fast convergence and
scalability of iterative solvers. However, for highly heterogeneous problems,
where the coefficient function is varying rapidly on several possibly
non-separated scales, the condition number of the preconditioned system
generally depends on the contrast of the coefficient function leading to a
deterioration of convergence. Enhancing the methods by coarse spaces
constructed from suitable local eigenvalue problems, also denoted as adaptive
or spectral coarse spaces, restores robust, contrast-independent convergence.
However, these eigenvalue problems typically rely on non-algebraic information,
such that the adaptive coarse spaces cannot be constructed from the fully
assembled system matrix. In this paper, a novel algebraic adaptive coarse
space, which relies on the a-orthogonal decomposition of (local) finite element
(FE) spaces into functions that solve the partial differential equation (PDE)
with some trace and FE functions that are zero on the boundary, is proposed. In
particular, the basis is constructed from eigenmodes of two types of local
eigenvalue problems associated with the edges of the domain decomposition. To
approximate functions that solve the PDE locally, we employ a transfer
eigenvalue problem, which was originally proposed for the construction of
optimal local approximation spaces for multiscale methods. In addition, we make
use of a Dirichlet eigenvalue problem that is a slight modification of the
Neumann eigenvalue problem used in the adaptive generalized Dryja-Smith-Widlund
(AGDSW) coarse space. Both eigenvalue problems rely solely on local Dirichlet
matrices, which can be extracted from the fully assembled system matrix. By
combining arguments from multiscale and domain decomposition methods we derive
a contrast-independent upper bound for the condition number
An extension of the approximate component mode synthesis method to the heterogeneous Helmholtz equation
In this work we propose and analyze an extension of the approximate component
mode synthesis (ACMS) method to the heterogeneous Helmholtz equation. The ACMS
method was originally introduced by Hetmaniuk and Lehoucq as a multiscale
method to solve elliptic partial differential equations. The ACMS method uses a
domain decomposition to separate the numerical approximation by splitting the
variational problem into two independent parts: local Helmholtz problems and a
global interface problem. While the former are naturally local and decoupled
such that they can be easily solved in parallel, the latter requires the
construction of suitable local basis functions relying on local eigenmodes and
suitable extensions. We carry out a full error analysis of this approach
focusing on the case where the domain decomposition is kept fixed, but the
number of eigenfunctions is increased. The theoretical results in this work are
supported by numerical experiments verifying algebraic convergence for the
method. In certain practically relevant cases, even exponential convergence
for the local Helmholtz problems can be achieved without oversampling
Monolithic Overlapping Schwarz Domain Decomposition Methods with GDSW Coarse Spaces for Saddle Point Problems
Monolithic overlapping Schwarz preconditioners for saddle point problems of Stokes, Navier-Stokes, and mixed linear elasticity type are presented. For the first time, coarse spaces obtained from the GDSW (Generalized Dryja-Smith-Widlund) approach are used in such a setting. Numerical results of our parallel implementation are presented for several model problems. In particular, cases are considered where the problem cannot or should not be reduced using local static condensation, e.g., Stokes, Navier-Stokes or mixed elasticity problems with continuous pressure spaces. In the new monolithic preconditioners, the local overlapping problems and the coarse problem are saddle point problems with the same structure as the original problem. Our parallel implementation of these preconditioners is based on the FROSch (Fast and Robust Overlapping Schwarz) library, which is part of the Trilinos package ShyLU. The implementation is algebraic in the sense that the preconditioners can be constructed from the fully assembled stiffness matrix and information about the block structure of the problem. Parallel scalability results for several thousand cores for Stokes, Navier-Stokes, and mixed linear elasticity model problems are reported. Each of the local problems is solved using a direct solver in serial mode, whereas the coarse problem is solved using a direct solver in serial or MPI-parallel mode or using an MPI-parallel iterative Krylov solve
An Experimental Study of Two-Level Schwarz Domain Decomposition Preconditioners on GPUs
The generalized Dryja-Smith-Widlund (GDSW) preconditioner is a two-level
overlapping Schwarz domain decomposition (DD) preconditioner that couples a
classical one-level overlapping Schwarz preconditioner with an
energy-minimizing coarse space. When used to accelerate the convergence rate of
Krylov subspace iterative methods, the GDSW preconditioner provides robustness
and scalability for the solution of sparse linear systems arising from the
discretization of a wide range of partial differential equations. In this paper,
we present FROSch (Fast and Robust Schwarz), a domain decomposition solver
package which implements GDSW-type preconditioners for both CPU and GPU
clusters. To improve the solver performance on GPUs, we use a novel
decomposition to run multiple MPI processes on each GPU, reducing both the solver's
computational and storage costs and potentially improving the convergence rate.
This allowed us to obtain competitive or faster performance using GPUs compared
to using CPUs alone. We demonstrate the performance of FROSch on the Summit
supercomputer with NVIDIA V100 GPUs, where we used NVIDIA Multi-Process Service
(MPS) to implement our decomposition strategy.
The solver has a wide variety of algorithmic and implementation choices,
which poses both opportunities and challenges for its GPU implementation. We
conduct a thorough experimental study with different solver options including
the exact or inexact solution of the local overlapping subdomain problems on a
GPU. We also discuss the effect of using the iterative variant of the
incomplete LU factorization and sparse-triangular solve as the approximate
local solver, and using lower precision for computing the whole FROSch
preconditioner. Overall, the solve time was reduced by factors of about
using GPUs, while the GPU acceleration of the numerical setup time
depends on the solver options and the local matrix sizes. Comment: Accepted for publication in IPDPS'2
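The two-level structure described in this abstract (local overlapping solves plus a coarse correction inside a Krylov method) can be illustrated with a minimal sketch. This is not the FROSch API and not the GDSW coarse space: it assumes a 1D Laplacian model problem, uses piecewise linear hat functions as a stand-in coarse basis, and all function names are hypothetical.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def apply_A(x):
    """Multiply by the 1D Laplacian matrix tridiag(-1, 2, -1)."""
    n = len(x)
    return [2.0 * x[i] - (x[i - 1] if i > 0 else 0.0)
            - (x[i + 1] if i < n - 1 else 0.0) for i in range(n)]

def solve_local(b):
    """Thomas algorithm for tridiag(-1, 2, -1) y = b (local Dirichlet problem)."""
    m = len(b)
    c, d = [0.0] * m, [0.0] * m
    c[0], d[0] = -0.5, b[0] / 2.0
    for i in range(1, m):
        denom = 2.0 + c[i - 1]
        c[i] = -1.0 / denom
        d[i] = (b[i] + d[i - 1]) / denom
    y = [0.0] * m
    y[-1] = d[-1]
    for i in range(m - 2, -1, -1):
        y[i] = d[i] - c[i] * y[i + 1]
    return y

def solve_dense(M, b):
    """Gaussian elimination with partial pivoting for the small coarse problem."""
    k = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(M)]
    for col in range(k):
        piv = max(range(col, k), key=lambda r: abs(aug[r][col]))
        aug[col], aug[piv] = aug[piv], aug[col]
        for r in range(col + 1, k):
            factor = aug[r][col] / aug[col][col]
            for j in range(col, k + 1):
                aug[r][j] -= factor * aug[col][j]
    x = [0.0] * k
    for r in range(k - 1, -1, -1):
        s = aug[r][k] - sum(aug[r][j] * x[j] for j in range(r + 1, k))
        x[r] = s / aug[r][r]
    return x

def make_preconditioner(n, num_sub, overlap, two_level):
    """Additive Schwarz: sum of local solves, optionally plus a coarse solve."""
    size = n // num_sub
    subdomains = [range(max(0, s * size - overlap),
                        min(n, (s + 1) * size + overlap)) for s in range(num_sub)]
    # Assumed coarse basis: one hat function per interior subdomain interface.
    hats = [[max(0.0, 1.0 - abs(i - s * size) / size) for i in range(n)]
            for s in range(1, num_sub)]
    A_hats = [apply_A(q) for q in hats]
    A0 = [[dot(p, Aq) for Aq in A_hats] for p in hats]  # Galerkin coarse matrix

    def apply(r):
        z = [0.0] * n
        for idx in subdomains:  # first level: overlapping local solves
            y = solve_local([r[i] for i in idx])
            for i, yi in zip(idx, y):
                z[i] += yi
        if two_level:           # second level: coarse correction
            coarse = solve_dense(A0, [dot(p, r) for p in hats])
            for cj, p in zip(coarse, hats):
                for i in range(n):
                    z[i] += cj * p[i]
        return z
    return apply

def pcg(b, precond, tol=1e-8, max_iter=500):
    """Preconditioned conjugate gradients; returns (solution, iterations)."""
    x, r = [0.0] * len(b), b[:]
    z = precond(r)
    p, rz, r0 = z[:], dot(r, z), dot(r, r) ** 0.5
    for it in range(1, max_iter + 1):
        Ap = apply_A(p)
        alpha = rz / dot(p, Ap)
        x = [xi + alpha * pj for xi, pj in zip(x, p)]
        r = [ri - alpha * aj for ri, aj in zip(r, Ap)]
        if dot(r, r) ** 0.5 <= tol * r0:
            return x, it
        z = precond(r)
        rz_new = dot(r, z)
        p = [zi + (rz_new / rz) * pj for zi, pj in zip(z, p)]
        rz = rz_new
    return x, max_iter

n, rhs = 128, [1.0] * 128
for flag in (False, True):
    _, iters = pcg(rhs, make_preconditioner(n, 16, 2, two_level=flag))
    print("two-level" if flag else "one-level", "iterations:", iters)
```

Comparing the two printed iteration counts while increasing the number of subdomains is one way to see why a coarse level is typically added for scalability. In GDSW itself the coarse basis is built from energy-minimizing extensions of interface functions rather than the hats assumed here.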
On temporal homogenization in the numerical simulation of atherosclerotic plaque growth
A temporal homogenization approach for the numerical simulation of
atherosclerotic plaque growth is extended to fully coupled fluid-structure
interaction (FSI) simulations. The numerical results indicate that the
two-scale approach yields significantly different results compared to a simple
heuristic averaging, where only stationary long-scale FSI problems are solved,
confirming the importance of incorporating stress variations on small
time-scales. In the homogenization approach, a fine-scale problem,
which is periodic with respect to the heart beat, has to be solved for each
long-scale time step. Even if no exact initial conditions are available,
periodicity can be achieved within only 2-3 heart beats by simple
time-stepping
Surrogate Convolutional Neural Network Models for Steady Computational Fluid Dynamics Simulations
A convolutional neural network (CNN)-based approach for the construction of reduced order surrogate models for computational fluid dynamics (CFD) simulations is introduced; it is inspired by the approach of Guo, Li, and Iorio [X. Guo, W. Li, and F. Iorio, Convolutional neural networks for steady flow approximation, in Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD ’16, New York, USA, 2016, ACM, pp. 481–490]. In particular, the neural networks are trained in order to predict images of the flow field in a channel with varying obstacle based on an image of the geometry of the channel. A classical CNN with bottleneck structure and a U-Net are compared while varying the input format, the number of decoder paths, as well as the loss function used to train the networks. This approach yields very low prediction errors, in particular, when using the U-Net architecture. Furthermore, the models are also able to generalize to unseen geometries of the same type. A transfer learning approach enables the model to be trained to a new type of geometry with very low training cost. Finally, based on this transfer learning approach, a sequential learning strategy is introduced, which significantly reduces the amount of necessary training data
FROSch Preconditioners for Land Ice Simulations of Greenland and Antarctica
Numerical simulations of Greenland and Antarctic ice sheets involve the solution of large-scale highly nonlinear systems of equations on complex shallow geometries. This work is concerned with the construction of Schwarz preconditioners for the solution of the associated tangent problems, which are challenging for solvers mainly because of the strong anisotropy of the meshes and wildly changing boundary conditions that can lead to poorly constrained problems on large portions of the domain. Here, two-level GDSW (Generalized Dryja–Smith–Widlund) type Schwarz preconditioners are applied to different land ice problems, i.e., a velocity problem, a temperature problem, as well as the coupling of the former two problems. We employ the MPI-parallel implementation of multi-level Schwarz preconditioners provided by the package FROSch (Fast and Robust Schwarz) from the Trilinos library. The strength of the proposed preconditioner is that it yields out-of-the-box scalable and robust preconditioners for the single physics problems.
To our knowledge, this is the first time two-level Schwarz preconditioners are applied to the ice sheet problem and a scalable preconditioner has been used for the coupled problem. The preconditioner for the coupled problem differs from previous monolithic GDSW preconditioners in the sense that decoupled extension operators are used to compute the values in the interior of the subdomains. Several approaches for improving the performance, such as reuse strategies and shared memory OpenMP parallelization, are explored as well.
In our numerical study, we consider both uniform meshes of varying resolution for the Antarctic ice sheet and nonuniform meshes for the Greenland ice sheet. We present several weak and strong scaling studies confirming the robustness of the approach and the parallel scalability of the FROSch implementation. Among the highlights of the numerical results are a weak scaling study for up to 32 K processor cores (8 K MPI-ranks and 4 OpenMP threads) and 566 M degrees of freedom for the velocity problem as well as a strong scaling study for up to 4 K processor cores (and MPI-ranks) and 68 M degrees of freedom for the coupled problem
A short note on solving partial differential equations using convolutional neural networks
The approach of using physics-based machine learning to solve PDEs has recently become very popular. A recent approach to solve PDEs based on CNNs uses finite difference stencils to include the residual of the partial differential equation into the loss function. In this work, the relation between the network training and the solution of a respective finite difference linear system of equations using classical numerical solvers is discussed. It turns out that many beneficial properties of the linear equation system are neglected in the network training. Finally, numerical results which underline the benefits of classical numerical solvers are presented
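The relation between a stencil-based residual loss and the underlying linear system can be made concrete with a small sketch. It is not the experiment from the paper: it assumes the 1D model problem -u'' = f with zero Dirichlet values, and it optimizes the grid values directly by gradient descent as a stand-in for network outputs. Minimizing the squared residual ||Au - f||^2 leads to a gradient that involves A applied twice, so the optimization sees the squared condition number of A, whereas a direct tridiagonal solve exploits the structure of A. All function names are hypothetical.

```python
import math

def stencil_residual(u, f, h):
    """Residual of the 3-point stencil for -u'' = f with zero boundary values."""
    n = len(u)
    res = []
    for i in range(n):
        left = u[i - 1] if i > 0 else 0.0
        right = u[i + 1] if i < n - 1 else 0.0
        res.append((-left + 2.0 * u[i] - right) / h**2 - f[i])
    return res

def loss(u, f, h):
    """Mean squared stencil residual, as used in residual-based training."""
    res = stencil_residual(u, f, h)
    return sum(r * r for r in res) / len(res)

def thomas_solve(f, h):
    """Direct solve of the tridiagonal system A u = f (Thomas algorithm)."""
    n = len(f)
    off, diag = -1.0 / h**2, 2.0 / h**2
    c, d = [0.0] * n, [0.0] * n
    c[0], d[0] = off / diag, f[0] / diag
    for i in range(1, n):
        denom = diag - off * c[i - 1]
        c[i] = off / denom
        d[i] = (f[i] - off * d[i - 1]) / denom
    u = [0.0] * n
    u[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        u[i] = d[i] - c[i] * u[i + 1]
    return u

def gradient_descent(f, h, steps):
    """Plain gradient descent on the loss, starting from u = 0."""
    n = len(f)
    u, zeros = [0.0] * n, [0.0] * n
    # Step 1/L, where L = (2/n) * (4/h^2)^2 bounds the gradient's Lipschitz constant.
    step = n * h**4 / 32.0
    for _ in range(steps):
        res = stencil_residual(u, f, h)
        # Gradient of the loss is (2/n) * A * res, since A is symmetric.
        grad = [2.0 / n * g for g in stencil_residual(res, zeros, h)]
        u = [ui - step * gi for ui, gi in zip(u, grad)]
    return u

n = 31
h = 1.0 / (n + 1)
xs = [(i + 1) * h for i in range(n)]
f = [math.pi**2 * math.sin(math.pi * x) for x in xs]  # exact solution sin(pi x)
print("loss after direct solve:      ", loss(thomas_solve(f, h), f, h))
print("loss after 200 gradient steps:", loss(gradient_descent(f, h, 200), f, h))
```

The fixed step size is a conservative choice made for this sketch. Practical training would use an adaptive optimizer, but the squared conditioning of the residual loss remains.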
Parallel Overlapping Schwarz Preconditioners and Multiscale Discretizations with Applications to Fluid-Structure Interaction and Highly Heterogeneous Problems
Accurate simulations of transmural wall stresses in atherosclerotic coronary arteries may help to predict plaque rupture. Therefore, a robust and efficient numerical framework for Fluid-Structure Interaction (FSI) of the blood flow and the arterial wall has to be set up, and suitable material laws for the modeling of the fluid and the structural response have to be incorporated. In this thesis, monolithic coupling algorithms and corresponding monolithic preconditioners are used to simulate FSI using highly nonlinear anisotropic polyconvex hyperelastic and anisotropic viscoelastic material models for the arterial wall. An MPI-parallel FSI software from the LifeV library is coupled to the software FEAP in order to enable access to the structural material models implemented in FEAP. To define a benchmark test for highly nonlinear material models in FSI, a simple geometry corresponding to a section of an idealized coronary artery, suitable boundary conditions, and material parameters adapted to experimental data are used. In particular, the geometry is chosen to be nonsymmetric to make effects due to the anisotropy of the structure visible. An initialization phase and several heartbeats are simulated, and systematic studies with meshes of increasing refinement and different space discretizations are carried out. The results indicate that, for the highly nonlinear material models, piecewise quadratic or F-bar element discretizations lead to significantly better results than piecewise linear shape functions. The results using piecewise linear shape functions are less accurate with respect to the displacements and, in particular, to the approximation of the stresses. To improve the performance of the FSI simulations, a more robust preconditioner for the highly nonlinear structural material models has to be used.
Therefore, a parallel implementation of the GDSW (Generalized Dryja-Smith-Widlund) preconditioner, which is a geometric two-level overlapping Schwarz preconditioner with energy-minimizing coarse space, is presented. The implementation, which is based on the software library Trilinos, is kept flexible to make further extensions of the preconditioner possible. Even though the dimension of its coarse space is comparatively large, parallel scalability for two- and three-dimensional scalar elliptic and linear elastic problems for thousands of cores is demonstrated. Also for unstructured domain decompositions and for a hybrid version of the preconditioner, convincing scalability is presented. When used as a preconditioner for the structure block in FSI simulations, the GDSW preconditioner shows excellent performance as well: scalability for up to 512 cores and a significant reduction of the simulation time and of the number of iterations with respect to the previously used preconditioner, IFPACK, are observed. IFPACK is an algebraic one-level overlapping Schwarz preconditioner. Finally, highly heterogeneous (multiscale) problems are investigated. Since the GDSW coarse space is not robust for general problems of this type, spaces based on Approximate Component Mode Synthesis (ACMS) are considered. On the basis of the ACMS space, coarse spaces for overlapping Schwarz methods are constructed, and a parallel implementation of a special finite element method is presented. For the coarse spaces, preliminary results indicating numerical scalability and robustness are discussed. For the parallel implementation of the special finite element method, very good parallel weak scalability is observed with respect to the construction of the basis functions and to the solution of the resulting linear system using the FETI-DP (Finite Element Tearing and Interconnecting - Dual Primal) method
Improving Pseudo-Time Stepping Convergence for CFD Simulations With Neural Networks
Computational fluid dynamics (CFD) simulations of viscous fluids described by
the Navier-Stokes equations are considered. Depending on the Reynolds number of
the flow, the Navier-Stokes equations may exhibit a highly nonlinear behavior.
The system of nonlinear equations resulting from the discretization of the
Navier-Stokes equations can be solved using nonlinear iteration methods, such
as Newton's method. However, fast quadratic convergence is typically only
obtained in a local neighborhood of the solution, and for many configurations,
the classical Newton iteration does not converge at all. In such cases,
so-called globalization techniques may help to improve convergence.
In this paper, pseudo-transient continuation is employed in order to improve
nonlinear convergence. The classical algorithm is enhanced by a neural network
model that is trained to predict a local pseudo-time step. Generalization of
the novel approach is facilitated by predicting the local pseudo-time step
separately on each element using only local information on a patch of adjacent
elements as input. Numerical results for standard benchmark problems, including
flow through a backward facing step geometry and Couette flow, show the
performance of the machine learning-enhanced globalization approach; as the
software for the simulations, the CFD module of COMSOL Multiphysics is
employed
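Pseudo-transient continuation as a globalization of Newton's method can be sketched on a scalar problem. This is only an illustration under stated assumptions: it uses the classical switched evolution relaxation (SER) rule for a single global pseudo-time step instead of the neural network prediction of local steps described in the abstract, and the test function arctan(u) and all names are hypothetical choices made here.

```python
import math

def newton(f, df, u, steps):
    """Classical Newton iteration without any globalization."""
    for _ in range(steps):
        u = u - f(u) / df(u)
    return u

def pseudo_transient(f, df, u, tau=1.0, tol=1e-10, max_steps=100, tau_max=1e12):
    """Pseudo-transient continuation: solve (1/tau + f'(u)) * delta = -f(u).

    The pseudo-time step tau is updated with the SER rule
    tau_new = tau * |f(u_old)| / |f(u_new)|, so the method approaches
    Newton's method as the residual decreases.
    Returns the approximate root and the number of steps taken.
    """
    res = abs(f(u))
    for step in range(max_steps):
        if res < tol:
            return u, step
        u = u - f(u) / (1.0 / tau + df(u))
        new_res = abs(f(u))
        if new_res > 0.0:
            tau = min(tau * res / new_res, tau_max)
        else:
            tau = tau_max
        res = new_res
    return u, max_steps

f = math.atan
df = lambda u: 1.0 / (1.0 + u * u)

# Newton's method is known to diverge for arctan from this starting value.
print("Newton after 5 steps:", newton(f, df, 3.0, 5))
print("Pseudo-transient continuation:", pseudo_transient(f, df, 3.0))
```

A small pseudo-time step damps the update far from the solution, while a large one recovers the Newton step. Choosing that step well is the part the paper delegates to a learned, element-local model.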