100 research outputs found

A fully algebraic and robust two-level Schwarz method based on optimal local approximation spaces

Authors: Heinlein Alexander; Smetana Kathrin
Publication date: 12/07/2022

Two-level domain decomposition preconditioners lead to fast convergence and scalability of iterative solvers. However, for highly heterogeneous problems, where the coefficient function is varying rapidly on several possibly non-separated scales, the condition number of the preconditioned system generally depends on the contrast of the coefficient function leading to a deterioration of convergence. Enhancing the methods by coarse spaces constructed from suitable local eigenvalue problems, also denoted as adaptive or spectral coarse spaces, restores robust, contrast-independent convergence. However, these eigenvalue problems typically rely on non-algebraic information, such that the adaptive coarse spaces cannot be constructed from the fully assembled system matrix. In this paper, a novel algebraic adaptive coarse space, which relies on the a-orthogonal decomposition of (local) finite element (FE) spaces into functions that solve the partial differential equation (PDE) with some trace and FE functions that are zero on the boundary, is proposed. In particular, the basis is constructed from eigenmodes of two types of local eigenvalue problems associated with the edges of the domain decomposition. To approximate functions that solve the PDE locally, we employ a transfer eigenvalue problem, which has originally been proposed for the construction of optimal local approximation spaces for multiscale methods. In addition, we make use of a Dirichlet eigenvalue problem that is a slight modification of the Neumann eigenvalue problem used in the adaptive generalized Dryja-Smith-Widlund (AGDSW) coarse space. Both eigenvalue problems rely solely on local Dirichlet matrices, which can be extracted from the fully assembled system matrix. By combining arguments from multiscale and domain decomposition methods we derive a contrast-independent upper bound for the condition number

arXiv.org e-Print Archive

An extension of the approximate component mode synthesis method to the heterogeneous Helmholtz equation

Authors: Giammatteo Elena; Heinlein Alexander; Schlottbom Matthias
Publication date: 26/09/2023

In this work we propose and analyze an extension of the approximate component mode synthesis (ACMS) method to the heterogeneous Helmholtz equation. The ACMS method has originally been introduced by Hetmaniuk and Lehoucq as a multiscale method to solve elliptic partial differential equations. The ACMS method uses a domain decomposition to separate the numerical approximation by splitting the variational problem into two independent parts: local Helmholtz problems and a global interface problem. While the former are naturally local and decoupled such that they can be easily solved in parallel, the latter requires the construction of suitable local basis functions relying on local eigenmodes and suitable extensions. We carry out a full error analysis of this approach focusing on the case where the domain decomposition is kept fixed, but the number of eigenfunctions is increased. The theoretical results in this work are supported by numerical experiments verifying algebraic convergence for the method. In certain, practically relevant cases, even exponential convergence for the local Helmholtz problems can be achieved without oversampling

arXiv.org e-Print Archive

Monolithic Overlapping Schwarz Domain Decomposition Methods with GDSW Coarse Spaces for Saddle Point Problems

Authors: Heinlein Alexander; Hochmuth Christian; Klawonn Axel
Publication date: 02/07/2018

Monolithic overlapping Schwarz preconditioners for saddle point problems of Stokes, Navier-Stokes, and mixed linear elasticity type are presented. For the first time, coarse spaces obtained from the GDSW (Generalized Dryja-Smith-Widlund) approach are used in such a setting. Numerical results of our parallel implementation are presented for several model problems. In particular, cases are considered where the problem cannot or should not be reduced using local static condensation, e.g., Stokes, Navier-Stokes or mixed elasticity problems with continuous pressure spaces. In the new monolithic preconditioners, the local overlapping problems and the coarse problem are saddle point problems with the same structure as the original problem. Our parallel implementation of these preconditioners is based on the FROSch (Fast and Robust Overlapping Schwarz) library, which is part of the Trilinos package ShyLU. The implementation is algebraic in the sense that the preconditioners can be constructed from the fully assembled stiffness matrix and information about the block structure of the problem. Parallel scalability results for several thousand cores for Stokes, Navier-Stokes, and mixed linear elasticity model problems are reported. Each of the local problems is solved using a direct solver in serial mode, whereas the coarse problem is solved using a direct solver in serial or MPI-parallel mode or using an MPI-parallel iterative Krylov solve

Kölner UniversitätsPublikationsServer

An Experimental Study of Two-Level Schwarz Domain Decomposition Preconditioners on GPUs

Authors: Heinlein Alexander; Rajamanickam Sivasankaran; Yamazaki Ichitaro
Publication date: 10/04/2023

The generalized Dryja--Smith--Widlund (GDSW) preconditioner is a two-level overlapping Schwarz domain decomposition (DD) preconditioner that couples a classical one-level overlapping Schwarz preconditioner with an energy-minimizing coarse space. When used to accelerate the convergence rate of Krylov subspace iterative methods, the GDSW preconditioner provides robustness and scalability for the solution of sparse linear systems arising from the discretization of a wide range of partial differential equations. In this paper, we present FROSch (Fast and Robust Schwarz), a domain decomposition solver package which implements GDSW-type preconditioners for both CPU and GPU clusters. To improve the solver performance on GPUs, we use a novel decomposition to run multiple MPI processes on each GPU, reducing both the solver's computational and storage costs and potentially improving the convergence rate. This allowed us to obtain competitive or faster performance using GPUs compared to using CPUs alone. We demonstrate the performance of FROSch on the Summit supercomputer with NVIDIA V100 GPUs, where we used NVIDIA Multi-Process Service (MPS) to implement our decomposition strategy. The solver has a wide variety of algorithmic and implementation choices, which poses both opportunities and challenges for its GPU implementation. We conduct a thorough experimental study with different solver options including the exact or inexact solution of the local overlapping subdomain problems on a GPU. We also discuss the effect of using the iterative variant of the incomplete LU factorization and sparse-triangular solve as the approximate local solver, and using lower precision for computing the whole FROSch preconditioner. Overall, the solve time was reduced by factors of about $2\times$ using GPUs, while the GPU acceleration of the numerical setup time depends on the solver options and the local matrix sizes. Comment: Accepted for publication in IPDPS'2
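The structure described in this abstract, a one-level overlapping Schwarz preconditioner combined with an energy-minimizing coarse space, can be illustrated on a toy problem. The following is a minimal NumPy sketch for a 1D Laplacian, not FROSch code and not its API. All function names, the dense matrices, the subdomain layout, and the choice of one coarse function per interface node are assumptions made for illustration only.

```python
import numpy as np

def laplacian_1d(n):
    """Dense 1D finite-difference Laplacian with Dirichlet boundaries."""
    h = 1.0 / (n + 1)
    return (2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)) / h**2

def two_level_schwarz(A, n_sub=4, overlap=2):
    """Dense two-level additive Schwarz operator (toy version, assumed layout).

    M^{-1} = Phi (Phi^T A Phi)^{-1} Phi^T + sum_i R_i^T (R_i A R_i^T)^{-1} R_i
    """
    n = A.shape[0]
    size = (n + 1) // n_sub                     # nodes per subdomain, incl. one interface node
    gamma = np.arange(size - 1, n - 1, size)    # interface nodes between subdomains
    interior = np.setdiff1d(np.arange(n), gamma)

    # Coarse basis: identity on the interface, energy-minimizing
    # (discrete harmonic) extension into the subdomain interiors.
    Phi = np.zeros((n, len(gamma)))
    Phi[gamma, np.arange(len(gamma))] = 1.0
    A_II = A[np.ix_(interior, interior)]
    A_IG = A[np.ix_(interior, gamma)]
    Phi[interior, :] = -np.linalg.solve(A_II, A_IG)

    # Coarse level.
    Minv = Phi @ np.linalg.solve(Phi.T @ A @ Phi, Phi.T)

    # First level: exact solves on overlapping index blocks.
    for k in range(n_sub):
        lo = max(0, k * size - 1 - overlap)
        hi = min(n, (k + 1) * size + overlap)
        idx = np.arange(lo, hi)
        Minv[np.ix_(idx, idx)] += np.linalg.inv(A[np.ix_(idx, idx)])
    return Minv

def condition_number(A, Minv=None):
    """Spectral condition number of A, or of M^{-1} A via a symmetric similarity transform."""
    if Minv is None:
        ev = np.linalg.eigvalsh(A)
    else:
        L = np.linalg.cholesky(A)               # A = L L^T
        ev = np.linalg.eigvalsh(L.T @ Minv @ L) # same spectrum as M^{-1} A
    return ev[-1] / ev[0]

A = laplacian_1d(63)
Minv = two_level_schwarz(A)
print("cond(A)        =", condition_number(A))
print("cond(M^-1 A)   =", condition_number(A, Minv))
```

Everything here is built from the assembled matrix `A` and index sets alone, which is the sense in which such a construction is called algebraic. A production solver would use sparse storage, distributed subdomains, and possibly inexact local solves, none of which this sketch attempts.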

arXiv.org e-Print Archive

On temporal homogenization in the numerical simulation of atherosclerotic plaque growth

Authors: Frei Stefan; Heinlein Alexander; Richter Thomas
Publication date: 01/01/2021

A temporal homogenization approach for the numerical simulation of atherosclerotic plaque growth is extended to fully coupled fluid-structure interaction (FSI) simulations. The numerical results indicate that the two-scale approach yields significantly different results compared to a simple heuristic averaging, where only stationary long-scale FSI problems are solved, confirming the importance of incorporating stress variations on small time-scales. In the homogenization approach, a fine-scale problem, which is periodic with respect to the heart beat, has to be solved for each long-scale time step. Even if no exact initial conditions are available, periodicity can be achieved within only 2-3 heart beats by simple time-stepping

arXiv.org e-Print Archive

KOPS - The Institutional Repository of the University of Konstanz

TU Delft Repository

Surrogate Convolutional Neural Network Models for Steady Computational Fluid Dynamics Simulations

Authors: Eichinger Matthias; Heinlein Alexander; Klawonn Axel
Publication date: 14/12/2020

A convolutional neural network (CNN)-based approach for the construction of reduced order surrogate models for computational fluid dynamics (CFD) simulations is introduced; it is inspired by the approach of Guo, Li, and Iorio [X. Guo, W. Li, and F. Iorio, Convolutional neural networks for steady flow approximation, in Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD ’16, New York, USA, 2016, ACM, pp. 481–490]. In particular, the neural networks are trained in order to predict images of the flow field in a channel with a varying obstacle based on an image of the geometry of the channel. A classical CNN with bottleneck structure and a U-Net are compared while varying the input format, the number of decoder paths, as well as the loss function used to train the networks. This approach yields very low prediction errors, in particular, when using the U-Net architecture. Furthermore, the models are also able to generalize to unseen geometries of the same type. A transfer learning approach enables the model to be trained on a new type of geometry with very low training cost. Finally, based on this transfer learning approach, a sequential learning strategy is introduced, which significantly reduces the amount of necessary training data

Kölner UniversitätsPublikationsServer

FROSch Preconditioners for Land Ice Simulations of Greenland and Antarctica

Authors: Heinlein Alexander; Perego Mauro; Rajamanickam Sivasankaran
Publication date: 25/01/2021

Numerical simulations of Greenland and Antarctic ice sheets involve the solution of large-scale highly nonlinear systems of equations on complex shallow geometries. This work is concerned with the construction of Schwarz preconditioners for the solution of the associated tangent problems, which are challenging for solvers mainly because of the strong anisotropy of the meshes and wildly changing boundary conditions that can lead to poorly constrained problems on large portions of the domain. Here, two-level GDSW (Generalized Dryja–Smith–Widlund) type Schwarz preconditioners are applied to different land ice problems, i.e., a velocity problem, a temperature problem, as well as the coupling of the former two problems. We employ the MPI-parallel implementation of multi-level Schwarz preconditioners provided by the package FROSch (Fast and Robust Schwarz) from the Trilinos library. The strength of the proposed preconditioner is that it yields out-of-the-box scalable and robust preconditioners for the single physics problems. To our knowledge, this is the first time two-level Schwarz preconditioners are applied to the ice sheet problem and a scalable preconditioner has been used for the coupled problem. The preconditioner for the coupled problem differs from previous monolithic GDSW preconditioners in the sense that decoupled extension operators are used to compute the values in the interior of the subdomains. Several approaches for improving the performance, such as reuse strategies and shared memory OpenMP parallelization, are explored as well. In our numerical study, we target both uniform meshes of varying resolution for the Antarctic ice sheet and non-uniform meshes for the Greenland ice sheet. We present several weak and strong scaling studies confirming the robustness of the approach and the parallel scalability of the FROSch implementation.
Among the highlights of the numerical results are a weak scaling study for up to 32 K processor cores (8 K MPI-ranks and 4 OpenMP threads) and 566 M degrees of freedom for the velocity problem as well as a strong scaling study for up to 4 K processor cores (and MPI-ranks) and 68 M degrees of freedom for the coupled problem

Kölner UniversitätsPublikationsServer

A short note on solving partial differential equations using convolutional neural networks

Authors: Grimm Viktor; Heinlein Alexander; Klawonn Axel
Publication date: 29/11/2022

The approach of using physics-based machine learning to solve PDEs has recently become very popular. A recent approach to solve PDEs based on CNNs uses finite difference stencils to include the residual of the partial differential equation into the loss function. In this work, the relation between the network training and the solution of a respective finite difference linear system of equations using classical numerical solvers is discussed. It turns out that many beneficial properties of the linear equation system are neglected in the network training. Finally, numerical results which underline the benefits of classical numerical solvers are presented
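The relation mentioned in this abstract, between minimizing a finite difference residual loss and solving the corresponding linear system, can be made concrete with a small experiment. The sketch below is an illustration under stated assumptions and is not taken from the paper: it uses a 1D Poisson problem, replaces the CNN by the vector of unknowns itself, and uses plain gradient descent as a stand-in for network training. The function names are hypothetical.

```python
import numpy as np

def poisson_system(n):
    """Finite-difference system A u = f for -u'' = pi^2 sin(pi x), u(0) = u(1) = 0."""
    h = 1.0 / (n + 1)
    x = np.linspace(h, 1.0 - h, n)
    A = (2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)) / h**2
    f = np.pi**2 * np.sin(np.pi * x)
    return A, f, x

def residual_loss(A, f, u):
    """Mean-squared finite-difference residual, the kind of loss used in such training."""
    r = A @ u - f
    return float(np.mean(r**2))

def gradient_descent(A, f, steps=1000):
    """Minimize the residual loss by gradient descent from u = 0 (stand-in for training)."""
    n = len(f)
    lam_max = np.linalg.eigvalsh(A)[-1]
    lr = n / (2.0 * lam_max**2)        # step 1/L, with L the largest Hessian eigenvalue
    u = np.zeros(n)
    for _ in range(steps):
        grad = (2.0 / n) * A.T @ (A @ u - f)
        u -= lr * grad
    return u

A, f, x = poisson_system(31)
u_direct = np.linalg.solve(A, f)       # classical direct solve of the same system
u_gd = gradient_descent(A, f)

print("loss after direct solve     :", residual_loss(A, f, u_direct))
print("loss after gradient descent :", residual_loss(A, f, u_gd))
print("max error of direct solve   :", np.max(np.abs(u_direct - np.sin(np.pi * x))))
```

Both approaches target the same minimizer, but the gradient of the squared residual involves `A.T @ A`, whose condition number is the square of that of `A` for this symmetric matrix. That is one simple way in which structure exploited by classical solvers is lost when the system is only seen through a residual loss.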

Kölner UniversitätsPublikationsServer

Parallel Overlapping Schwarz Preconditioners and Multiscale Discretizations with Applications to Fluid-Structure Interaction and Highly Heterogeneous Problems

Author: Heinlein Alexander
Publication date: 06/06/2016

Accurate simulations of transmural wall stresses in atherosclerotic coronary arteries may help to predict plaque rupture. Therefore, a robust and efficient numerical framework for Fluid-Structure Interaction (FSI) of the blood flow and the arterial wall has to be set up, and suitable material laws for the modeling of the fluid and the structural response have to be incorporated. In this thesis, monolithic coupling algorithms and corresponding monolithic preconditioners are used to simulate FSI using highly nonlinear anisotropic polyconvex hyperelastic and anisotropic viscoelastic material models for the arterial wall. An MPI-parallel FSI software from the LifeV library is coupled to the software FEAP in order to enable access to the structural material models implemented in FEAP. To define a benchmark test for highly nonlinear material models in FSI, a simple geometry corresponding to a section of an idealized coronary artery, suitable boundary conditions, and material parameters adapted to experimental data are used. In particular, the geometry is chosen to be nonsymmetric to make effects due to the anisotropy of the structure visible. An initialization phase and several heartbeats are simulated, and systematic studies with meshes of increasing refinement and different space discretizations are carried out. The results indicate that, for the highly nonlinear material models, piecewise quadratic or F-bar element discretizations lead to significantly better results than piecewise linear shape functions. The results using piecewise linear shape functions are less accurate with respect to the displacements and, in particular, to the approximation of the stresses. To improve the performance of the FSI simulations, a more robust preconditioner for the highly nonlinear structural material models has to be used.
Therefore, a parallel implementation of the GDSW (Generalized Dryja-Smith-Widlund) preconditioner, which is a geometric two-level overlapping Schwarz preconditioner with energy-minimizing coarse space, is presented. The implementation, which is based on the software library Trilinos, is kept flexible to make further extensions of the preconditioner possible. Even though the dimension of its coarse space is comparatively large, parallel scalability for two- and three-dimensional scalar elliptic and linear elastic problems for thousands of cores is demonstrated. Also for unstructured domain decompositions and for a hybrid version of the preconditioner, convincing scalability is presented. When used as a preconditioner for the structure block in FSI simulations, the GDSW preconditioner shows excellent performance as well: scalability for up to 512 cores and a significant reduction of the simulation time and of the number of iterations with respect to the previously used preconditioner, IFPACK, are observed. IFPACK is an algebraic one-level overlapping Schwarz preconditioner. Finally, highly heterogeneous (multiscale) problems are investigated. Since the GDSW coarse space is not robust for general problems of this type, spaces based on Approximate Component Mode Synthesis (ACMS) are considered. On the basis of the ACMS space, coarse spaces for overlapping Schwarz methods are constructed, and a parallel implementation of a special finite element method is presented. For the coarse spaces, preliminary results indicating numerical scalability and robustness are discussed. For the parallel implementation of the special finite element method, very good parallel weak scalability is observed with respect to the construction of the basis functions and to the solution of the resulting linear system using the FETI-DP (Finite Element Tearing and Interconnecting - Dual Primal) method

Kölner UniversitätsPublikationsServer

Improving Pseudo-Time Stepping Convergence for CFD Simulations With Neural Networks

Authors: Heinlein Alexander; van Noorden Tycho; Zandbergen Anouk
Publication date: 10/10/2023

Computational fluid dynamics (CFD) simulations of viscous fluids described by the Navier-Stokes equations are considered. Depending on the Reynolds number of the flow, the Navier-Stokes equations may exhibit a highly nonlinear behavior. The system of nonlinear equations resulting from the discretization of the Navier-Stokes equations can be solved using nonlinear iteration methods, such as Newton's method. However, fast quadratic convergence is typically only obtained in a local neighborhood of the solution, and for many configurations, the classical Newton iteration does not converge at all. In such cases, so-called globalization techniques may help to improve convergence. In this paper, pseudo-transient continuation is employed in order to improve nonlinear convergence. The classical algorithm is enhanced by a neural network model that is trained to predict a local pseudo-time step. Generalization of the novel approach is facilitated by predicting the local pseudo-time step separately on each element using only local information on a patch of adjacent elements as input. Numerical results for standard benchmark problems, including flow through a backward facing step geometry and Couette flow, show the performance of the machine learning-enhanced globalization approach; as the software for the simulations, the CFD module of COMSOL Multiphysics is employed
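The globalization idea in this abstract can be seen on a scalar toy problem. The sketch below is a minimal illustration, not the method of the paper: it solves arctan(u) = 0, for which the classical Newton iteration diverges from a distant starting point, and it uses the standard switched evolution relaxation (SER) rule to update the pseudo-time step instead of a neural network prediction. The function names and parameter values are assumptions for illustration.

```python
import math

def newton(f, df, u0, steps=5):
    """Classical Newton iteration with a fixed number of steps and no globalization."""
    u = u0
    for _ in range(steps):
        u -= f(u) / df(u)
    return u

def pseudo_transient(f, df, u0, dt0=1.0, tol=1e-10, max_steps=100):
    """Pseudo-transient continuation with an SER pseudo-time step update.

    Each step solves (1/dt + f'(u)) s = -f(u). A small dt damps the step,
    and a large dt recovers the Newton step.
    """
    u, dt = u0, dt0
    res = abs(f(u))
    for _ in range(max_steps):
        if res < tol:
            break
        u += -f(u) / (1.0 / dt + df(u))
        new_res = abs(f(u))
        if new_res == 0.0:
            break
        dt *= res / new_res            # SER: grow dt as the residual decreases
        res = new_res
    return u

f = math.atan
df = lambda u: 1.0 / (1.0 + u * u)

print("Newton after 5 steps from u0 = 10 :", newton(f, df, 10.0))
print("PTC result from u0 = 10           :", pseudo_transient(f, df, 10.0))
```

In the approach summarized above, the pseudo-time step is local and predicted per element by a trained model. The SER rule used here is a single global heuristic and only serves to show why damping with `1/dt` can rescue a diverging Newton iteration.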

arXiv.org e-Print Archive