Multilevel Scalable Solvers for Stochastic Linear and Nonlinear Problems
This article discusses uncertainty quantification (UQ) for
time-independent linear and nonlinear partial differential equation (PDE)-based
systems with random model parameters, carried out using a sampling-free intrusive
stochastic Galerkin method that leverages multilevel scalable solvers constructed
by combining a two-grid Schwarz method with algebraic multigrid (AMG).
High-resolution spatial meshes along
with a large number of stochastic expansion terms increase the system size,
leading to significant memory consumption and computational cost. Domain
decomposition (DD)-based parallel scalable solvers are developed to this end
for linear and nonlinear stochastic PDEs. A generalized minimum residual
(GMRES) iterative solver equipped with a multilevel preconditioner consisting
of restricted additive Schwarz (RAS) for the fine grid and AMG
for the coarse grid is constructed to improve scalability. Numerical
experiments illustrate the scalability of the proposed solver for stochastic
linear and nonlinear Poisson problems.
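As a rough illustration of the solver structure described in this abstract, the following pure-Python sketch runs right-preconditioned GMRES on a small deterministic 1D Poisson problem, with a preconditioner that adds a restricted additive Schwarz sweep on the fine grid to a coarse-grid correction. It is not the authors' implementation. The model problem, the subdomain layout (`owned`, `overlap`), the hat-function coarse space `P`, and all function names are assumptions made for this example, and a direct coarse solve stands in for AMG.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def matvec(A, x):
    return [dot(row, x) for row in A]

def solve_dense(A, b):
    # Gaussian elimination with partial pivoting; fine for tiny systems.
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for k in range(n):
        p = max(range(k, n), key=lambda i: abs(M[i][k]))
        M[k], M[p] = M[p], M[k]
        for i in range(k + 1, n):
            f = M[i][k] / M[k][k]
            for j in range(k, n + 1):
                M[i][j] -= f * M[k][j]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x

# Model problem: -u'' = 1 on (0, 1) with u(0) = u(1) = 0, n interior nodes.
n = 19
h = 1.0 / (n + 1)
A = [[(2.0 if i == j else -1.0 if abs(i - j) == 1 else 0.0) / h**2
      for j in range(n)] for i in range(n)]
b = [1.0] * n

# Fine level: four subdomains, each extended by `overlap` nodes per side.
owned = [range(0, 5), range(5, 10), range(10, 15), range(15, 19)]
overlap = 2

# Coarse level: hat functions centred at x = 0.2, 0.4, 0.6, 0.8 (columns of P).
nc, Hc = 4, 0.2
P = [[max(0.0, 1.0 - abs((i + 1) * h - (k + 1) * Hc) / Hc) for k in range(nc)]
     for i in range(n)]
Pt = [[P[i][k] for i in range(n)] for k in range(nc)]   # P transposed
A0 = [[dot(Pt[k], matvec(A, Pt[l])) for l in range(nc)] for k in range(nc)]

def apply_preconditioner(r):
    z = [0.0] * n
    for own in owned:
        ext = range(max(0, own.start - overlap), min(n, own.stop + overlap))
        A_loc = [[A[i][j] for j in ext] for i in ext]
        y = solve_dense(A_loc, [r[i] for i in ext])
        for i, yi in zip(ext, y):
            if i in own:          # "restricted": keep only owned entries
                z[i] += yi
    y0 = solve_dense(A0, matvec(Pt, r))   # direct solve standing in for AMG
    return [zi + dot(P[i], y0) for i, zi in enumerate(z)]

def gmres(apply_A, apply_M, b, tol=1e-10, max_iter=50):
    # Right-preconditioned GMRES without restart, zero initial guess.
    beta = math.sqrt(dot(b, b))
    V, R, cs, sn, g = [[bi / beta for bi in b]], [], [], [], [beta]
    for j in range(max_iter):
        w = apply_A(apply_M(V[j]))
        col = []
        for v in V:                                  # modified Gram-Schmidt
            col.append(dot(w, v))
            w = [wi - col[-1] * vi for wi, vi in zip(w, v)]
        h_next = math.sqrt(dot(w, w))
        for i in range(j):                           # earlier Givens rotations
            col[i], col[i + 1] = (cs[i] * col[i] + sn[i] * col[i + 1],
                                  -sn[i] * col[i] + cs[i] * col[i + 1])
        d = math.hypot(col[j], h_next)
        cs.append(col[j] / d)
        sn.append(h_next / d)
        col[j] = d
        g.append(-sn[j] * g[j])
        g[j] = cs[j] * g[j]
        R.append(col)
        if abs(g[j + 1]) <= tol * beta or h_next == 0.0:
            break
        V.append([wi / h_next for wi in w])
    k = len(R)
    y = [0.0] * k
    for i in reversed(range(k)):                     # back substitution
        y[i] = (g[i] - sum(R[c][i] * y[c] for c in range(i + 1, k))) / R[i][i]
    u = [sum(y[c] * V[c][i] for c in range(k)) for i in range(len(b))]
    return apply_M(u), k

x, iters = gmres(lambda v: matvec(A, v), apply_preconditioner, b)
exact = [(i + 1) * h * (1.0 - (i + 1) * h) / 2.0 for i in range(n)]
print(iters, max(abs(xi - ui) for xi, ui in zip(x, exact)))
```

The script prints the iteration count and the maximum nodal error against x(1 - x)/2, which the three-point stencil reproduces exactly for this right-hand side. Dense lists keep the sketch self-contained; a realistic solver would use sparse, distributed data structures and solve the subdomain problems in parallel.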
End-to-end GPU acceleration of low-order-refined preconditioning for high-order finite element discretizations
In this paper, we present algorithms and implementations for the end-to-end
GPU acceleration of matrix-free low-order-refined preconditioning of high-order
finite element problems. The methods described here allow for the construction
of effective preconditioners for high-order problems with optimal memory usage
and computational complexity. The preconditioners are based on the construction
of a spectrally equivalent low-order discretization on a refined mesh, which is
then amenable to, for example, algebraic multigrid preconditioning. The
constants of equivalence are independent of mesh size and polynomial degree.
For vector finite element problems in H(curl) and H(div) (e.g.
for electromagnetic or radiation diffusion problems) a specially constructed
interpolation-histopolation basis is used to ensure fast convergence. Detailed
performance studies are carried out to analyze the efficiency of the GPU
algorithms. The kernel throughput of each of the main algorithmic components is
measured, and the strong and weak parallel scalability of the methods is
demonstrated. The different relative weighting and significance of the
algorithmic components on GPUs and CPUs is discussed. Results on problems
involving adaptively refined nonconforming meshes are shown, and the use of the
preconditioners on a large-scale magnetic diffusion problem using all spaces of
the finite element de Rham complex is illustrated.
Comment: 23 pages, 13 figures
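The spectral equivalence that this abstract relies on can be written out as a short worked inequality. The notation is assumed here rather than taken from the paper: $A_p$ denotes the high-order matrix of degree $p$ on a mesh of size $h$, $A_{\mathrm{LOR}}$ the low-order-refined matrix acting on the same degrees of freedom, and $c$, $C$ the equivalence constants. Both matrices are taken to be symmetric positive definite.

```latex
c \, x^{T} A_{\mathrm{LOR}} \, x \;\le\; x^{T} A_{p} \, x \;\le\; C \, x^{T} A_{\mathrm{LOR}} \, x
\quad \text{for all } x,
\qquad \text{with } c, C > 0 \text{ independent of } h \text{ and } p,
\]
\[
\Longrightarrow \qquad
\kappa\!\left( A_{\mathrm{LOR}}^{-1} A_{p} \right) \;\le\; \frac{C}{c}.
```

Under these assumptions, a preconditioner $B$ that is itself spectrally equivalent to $A_{\mathrm{LOR}}$, for example one algebraic multigrid cycle, gives a bound of the same form for $\kappa(B^{-1} A_p)$, with the two pairs of constants multiplied together.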
Optimal-complexity and robust multigrid methods for high-order FEM
The numerical solution of elliptic PDEs is often the most computationally intensive task in large-scale continuum mechanics simulations. High-order finite element methods can efficiently exploit modern parallel hardware while offering very rapid convergence properties. As the polynomial degree is increased, the efficient solution of such PDEs becomes difficult.
This thesis develops preconditioners for high-order discretizations. We build upon the pioneering work of Pavarino, who proved in 1993 that the additive Schwarz method with vertex patches and a low-order coarse space gives a solver for symmetric and coercive problems that is robust to the polynomial degree. However, for very high polynomial degrees it is not feasible to assemble or factorize the matrices for each vertex patch, as the patch matrices contain dense blocks, which couple together all degrees of freedom within a cell. The central novelty of the preconditioners we develop is that they have optimal time and space complexity on unstructured meshes of tensor-product cells.
Our solver relies on new finite elements for the de Rham complex that enable the blocks in the stiffness matrix corresponding to the cell interiors to become diagonal for scalar PDEs or block diagonal for vector-valued PDEs. With these new elements, the patch problems are as sparse as a low-order finite difference discretization, while having a sparser Cholesky factorization. In the non-separable case, the method can be applied as a preconditioner by approximating the problem with a separable surrogate. Through the careful use of incomplete factorizations and choice of space decomposition we achieve optimal fill-in in the patch factors, ultimately allowing for optimal-complexity storage and computational cost across the setup and solution stages.
We demonstrate the approach by solving a variety of symmetric and coercive problems, including the Poisson equation, the Riesz maps of H(curl) and H(div), and an H(div)-conforming interior penalty discretization of linear elasticity in three dimensions at p = 15.
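To give a feel for why assembling or factorizing vertex-patch matrices becomes infeasible at high degree, the following sketch counts unknowns under simplifying assumptions that are not taken from the thesis: a scalar Lagrange-type element, a structured patch of 2 x 2 x 2 hexahedral cells around an interior vertex, and a worst-case fully dense patch factor stored in double precision. All function names are hypothetical.

```python
def patch_dofs(p, dim=3):
    # Interior nodes of a 2 x 2 x 2 block of degree-p cells: (2p - 1) per direction.
    return (2 * p - 1) ** dim

def cell_interior_dofs(p, dim=3):
    # Interior nodes of one degree-p tensor-product cell: (p - 1) per direction.
    return (p - 1) ** dim

def dense_entries(ndofs):
    # Number of entries in a dense ndofs x ndofs block.
    return ndofs ** 2

for p in (3, 7, 15):
    interior = cell_interior_dofs(p)
    patch = patch_dofs(p)
    gib = dense_entries(patch) * 8 / 2**30     # double precision storage
    print(f"p={p:2d}  cell interior dofs={interior:5d}  "
          f"dense interior block entries={dense_entries(interior):9d}  "
          f"patch dofs={patch:6d}  dense patch factor ~{gib:.3f} GiB")
```

Under these assumptions the dense cell-interior block grows like (p - 1)^6 entries and a single patch at p = 15 already has 24389 unknowns, which is the growth that a diagonal or block diagonal interior block avoids.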