Non-invasive multigrid for semi-structured grids
Multigrid solvers for hierarchical hybrid grids (HHG) have been proposed to
promote the efficient utilization of high performance computer architectures.
These HHG meshes are constructed by uniformly refining a relatively coarse
fully unstructured mesh. While HHG meshes provide some flexibility for
unstructured applications, most multigrid calculations can be accomplished
using efficient structured grid ideas and kernels. This paper focuses on
generalizing the HHG idea so that it is applicable to a broader community of
computational scientists, and so that it is easier for existing applications to
leverage structured multigrid components. Specifically, we adapt the structured
multigrid methodology to significantly more complex semi-structured meshes.
Further, we illustrate how mature applications might adopt a semi-structured
solver in a relatively non-invasive fashion. To do this, we propose a formal
mathematical framework for describing the semi-structured solver. This
formalism allows us to precisely define the associated multigrid method and to
show its relationship to a more traditional multigrid solver. Additionally, the
mathematical framework clarifies the associated software design and
implementation. Numerical experiments highlight the relationship of the new
solver with classical multigrid. We also demonstrate the generality and
potential performance gains associated with this type of semi-structured
multigrid.
A Parallel Solver for Graph Laplacians
Problems from graph drawing, spectral clustering, network flow and graph
partitioning can all be expressed in terms of graph Laplacian matrices. There
are a variety of practical approaches to solving these problems in serial.
However, as problem sizes increase and single core speeds stagnate, parallelism
is essential to solve such problems quickly. We present an unsmoothed
aggregation multigrid method for solving graph Laplacians in a distributed
memory setting. We introduce new parallel aggregation and low degree
elimination algorithms targeted specifically at irregular degree graphs. These
algorithms are expressed in terms of sparse matrix-vector products using
generalized sum and product operations. This formulation is amenable to linear
algebra using arbitrary distributions and allows us to operate on a 2D sparse
matrix distribution, which is necessary for parallel scalability. Our solver
outperforms the natural parallel extension of the current state of the art in
an algorithmic comparison. We demonstrate scalability to 576 processes and
graphs with up to 1.7 billion edges. Comment: PASC '18, Code: https://github.com/ligmg/ligm
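The "generalized sum and product operations" mentioned in this abstract can be illustrated with a small sketch. It is not taken from the paper's code: the function name `semiring_spmv`, the row-dictionary storage, and the example graph are all assumptions chosen for illustration. The idea shown is only that one sparse matrix-vector loop can serve several graph operations when its add and multiply are swapped out.

```python
from typing import Callable, Dict, List

# Hypothetical storage: one dict per row, mapping column index -> value.
SparseRows = List[Dict[int, float]]


def semiring_spmv(
    rows: SparseRows,
    x: List[float],
    add: Callable[[float, float], float],
    mul: Callable[[float, float], float],
    zero: float,
) -> List[float]:
    """Compute y[i] = add-reduction over j of mul(A[i][j], x[j]).

    `zero` is the identity element of `add` and is the result for empty rows.
    """
    y = []
    for row in rows:
        acc = zero
        for j, a_ij in row.items():
            acc = add(acc, mul(a_ij, x[j]))
        y.append(acc)
    return y


# Path graph 0 - 1 - 2 with unit edge weights (adjacency entries only).
adjacency: SparseRows = [{1: 1.0}, {0: 1.0, 2: 1.0}, {1: 1.0}]
values = [1.0, 2.0, 3.0]

# Ordinary (+, *): every vertex sums the values of its neighbours.
plain = semiring_spmv(adjacency, values, lambda a, b: a + b, lambda a, b: a * b, 0.0)

# (max, *): every vertex learns the largest value among its neighbours,
# the kind of local reduction an aggregation step can be built from.
strongest = semiring_spmv(adjacency, values, max, lambda a, b: a * b, float("-inf"))

print(plain)      # [2.0, 4.0, 2.0]
print(strongest)  # [2.0, 3.0, 2.0]
```

Because only the two operators change, the same data distribution and communication pattern could in principle be reused for each variant, which is the property the abstract relies on.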
A matrix-free ILU realization based on surrogates
Matrix-free techniques play an increasingly important role in large-scale
simulations. Schur complement techniques and massively parallel multigrid
solvers for second-order elliptic partial differential equations can
significantly benefit from reduced memory traffic and consumption. The
matrix-free approach often restricts solver components to purely local
operations, for instance, the Jacobi or Gauss-Seidel smoothers in multigrid
methods. An incomplete LU (ILU) decomposition cannot be calculated from local
information and is therefore not amenable to an on-the-fly computation which is
typically needed for matrix-free calculations. It generally requires the
storage and factorization of a sparse matrix which contradicts the low memory
requirements in large scale scenarios. In this work, we propose a matrix-free
ILU realization. More precisely, we introduce a memory-efficient, matrix-free
ILU(0)-Smoother component for low-order conforming finite elements on
tetrahedral hybrid grids. Hybrid grids consist of an unstructured macro-mesh
which is subdivided into a structured micro-mesh. The ILU(0) is used for
degrees-of-freedom assigned to the interior of macro-tetrahedra. This
ILU(0)-Smoother can be used for the efficient matrix-free evaluation of the
Steklov-Poincaré operator from domain-decomposition methods. After introducing
and formally defining our smoother, we investigate its performance on refined
macro-tetrahedra. Secondly, the ILU(0)-Smoother on the macro-tetrahedra is
implemented via surrogate matrix polynomials in conjunction with a fast
on-the-fly evaluation scheme resulting in an efficient matrix-free algorithm.
The polynomial coefficients are obtained by solving a least-squares problem on
a small part of the factorized ILU(0) matrices to stay memory efficient. The
convergence rates of this smoother with respect to the polynomial order are
thoroughly studied.
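For readers unfamiliar with ILU(0), the following is a minimal textbook sketch of the factorization that the abstract starts from. It stores the matrix explicitly, so it is the conventional stored-matrix variant and not the matrix-free surrogate realization the paper proposes. The function name `ilu0` and the dense list-of-lists storage are assumptions made to keep the example short.

```python
from typing import List

Matrix = List[List[float]]


def ilu0(a: Matrix) -> Matrix:
    """Return ILU(0) factors packed in one matrix.

    The strict lower triangle holds L (unit diagonal implied) and the upper
    triangle holds U. Updates are applied only where the original matrix is
    nonzero, so any fill-in outside that pattern is dropped.
    """
    n = len(a)
    lu = [row[:] for row in a]
    pattern = [[a[i][j] != 0.0 for j in range(n)] for i in range(n)]
    for i in range(1, n):
        for k in range(i):
            if not pattern[i][k]:
                continue
            lu[i][k] /= lu[k][k]  # multiplier l_ik
            for j in range(k + 1, n):
                if pattern[i][j]:  # skip positions that would be fill-in
                    lu[i][j] -= lu[i][k] * lu[k][j]
    return lu


# Tridiagonal matrix: no fill-in occurs, so ILU(0) equals the exact LU.
tri = ilu0([[2.0, -1.0, 0.0], [-1.0, 2.0, -1.0], [0.0, -1.0, 2.0]])

# "Arrow" matrix: exact LU would fill positions (1, 2) and (2, 1);
# ILU(0) leaves them at zero.
arrow = ilu0([[4.0, -1.0, -1.0], [-1.0, 4.0, 0.0], [-1.0, 0.0, 4.0]])

print(tri[2][2])                  # last pivot of the tridiagonal case
print(arrow[1][2], arrow[2][1])   # 0.0 0.0
```

The loop over `k` depends on previously eliminated rows, which illustrates why the abstract says the factorization cannot be computed from purely local information.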
Parallel Unsmoothed Aggregation Algebraic Multigrid Algorithms on GPUs
We design and implement a parallel algebraic multigrid method for isotropic
graph Laplacian problems on multicore graphics processing units (GPUs). The
proposed AMG method is based on the aggregation framework. The setup phase of
the algorithm uses a parallel maximal independent set algorithm in forming
aggregates and the resulting coarse level hierarchy is then used in a K-cycle
iteration solve phase with a -Jacobi smoother. Numerical tests of a
parallel implementation of the method for graphics processors are presented to
demonstrate its effectiveness. Comment: 18 pages, 3 figures
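The setup phase described above, forming aggregates from a maximal independent set and building an unsmoothed coarse level, can be sketched serially as follows. This is an illustrative assumption, not the paper's GPU algorithm: the greedy sequential MIS, the function names `mis_aggregates` and `coarse_laplacian`, and the unweighted adjacency-list input are all chosen here for brevity.

```python
from typing import List


def mis_aggregates(adj: List[List[int]]) -> List[int]:
    """Assign each vertex to an aggregate rooted at a vertex of a greedy MIS."""
    n = len(adj)
    blocked = [False] * n   # True once a vertex is a root or next to one
    is_root = [False] * n
    agg = [-1] * n
    n_agg = 0
    for v in range(n):
        if not blocked[v]:
            is_root[v] = True
            agg[v] = n_agg
            n_agg += 1
            blocked[v] = True
            for w in adj[v]:
                blocked[w] = True
    # Maximality guarantees every non-root vertex has a root neighbour.
    for v in range(n):
        if not is_root[v]:
            for w in adj[v]:
                if is_root[w]:
                    agg[v] = agg[w]
                    break
    return agg


def coarse_laplacian(adj: List[List[int]], agg: List[int]) -> List[List[float]]:
    """Form P^T A P for the unweighted graph Laplacian A, where P is the
    piecewise-constant (unsmoothed) prolongator defined by `agg`."""
    nc = max(agg) + 1
    ac = [[0.0] * nc for _ in range(nc)]
    for i, nbrs in enumerate(adj):
        ac[agg[i]][agg[i]] += float(len(nbrs))  # diagonal entry: degree
        for j in nbrs:
            ac[agg[i]][agg[j]] -= 1.0           # off-diagonal entry: -1
    return ac


# Path graph 0 - 1 - 2 - 3 - 4.
path = [[1], [0, 2], [1, 3], [2, 4], [3]]
aggregates = mis_aggregates(path)
coarse = coarse_laplacian(path, aggregates)

print(aggregates)  # [0, 0, 1, 1, 2]
print(coarse)      # [[1.0, -1.0, 0.0], [-1.0, 2.0, -1.0], [0.0, -1.0, 1.0]]
```

In this example the coarse operator is again the Laplacian of a (shorter) path graph, so the same construction can be applied recursively to build the level hierarchy.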
Afivo: a framework for quadtree/octree AMR with shared-memory parallelization and geometric multigrid methods
Afivo is a framework for simulations with adaptive mesh refinement (AMR) on
quadtree (2D) and octree (3D) grids. The framework comes with a geometric
multigrid solver and shared-memory (OpenMP) parallelism, and it supports output in
Silo and VTK file formats. Afivo can be used to efficiently simulate AMR
problems with up to about unknowns on desktops, workstations or single
compute nodes. For larger problems, existing distributed-memory frameworks are
better suited. The framework has no built-in functionality for specific physics
applications, so users have to implement their own numerical methods. The
included multigrid solver can be used to efficiently solve elliptic partial
differential equations such as Poisson's equation. Afivo's design was kept
simple, which in combination with the shared-memory parallelism facilitates
modification and experimentation with AMR algorithms. The framework has already
been used to perform 3D simulations of streamer discharges, which required tens
of millions of cells.
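As a rough illustration of how a geometric multigrid solver handles Poisson's equation, here is a minimal V-cycle for the 1D problem -u'' = f with zero boundary values on a uniform grid. It does not use Afivo's API and does not cover quadtree/octree refinement; the function names, the damped Jacobi smoother, and the choice of two smoothing sweeps are assumptions made for this sketch.

```python
from typing import List

Vec = List[float]


def residual(u: Vec, f: Vec, h: float) -> Vec:
    """r = f - A u for the standard 3-point Laplacian with zero boundaries."""
    n = len(u)
    r = []
    for i in range(n):
        left = u[i - 1] if i > 0 else 0.0
        right = u[i + 1] if i < n - 1 else 0.0
        r.append(f[i] - (2.0 * u[i] - left - right) / (h * h))
    return r


def jacobi(u: Vec, f: Vec, h: float, sweeps: int = 2, omega: float = 2.0 / 3.0) -> Vec:
    """Damped Jacobi smoothing; the diagonal of A is 2 / h^2."""
    for _ in range(sweeps):
        r = residual(u, f, h)
        u = [ui + omega * 0.5 * h * h * ri for ui, ri in zip(u, r)]
    return u


def restrict(r: Vec) -> Vec:
    """Full weighting; coarse point i coincides with fine point 2i + 1."""
    nc = (len(r) - 1) // 2
    return [0.25 * r[2 * i] + 0.5 * r[2 * i + 1] + 0.25 * r[2 * i + 2] for i in range(nc)]


def prolong(e: Vec, n_fine: int) -> Vec:
    """Linear interpolation from the coarse grid to the fine grid."""
    out = [0.0] * n_fine
    for i, v in enumerate(e):
        out[2 * i] += 0.5 * v
        out[2 * i + 1] += v
        out[2 * i + 2] += 0.5 * v
    return out


def v_cycle(u: Vec, f: Vec, h: float) -> Vec:
    if len(u) == 1:
        return [0.5 * h * h * f[0]]  # exact solve on the coarsest grid
    u = jacobi(u, f, h)                                  # pre-smoothing
    coarse_rhs = restrict(residual(u, f, h))
    correction = v_cycle([0.0] * len(coarse_rhs), coarse_rhs, 2.0 * h)
    u = [a + b for a, b in zip(u, prolong(correction, len(u)))]
    return jacobi(u, f, h)                               # post-smoothing


# 63 interior points (2^6 - 1), right-hand side f = 1.
n = 63
h = 1.0 / (n + 1)
f = [1.0] * n
u = [0.0] * n
for _ in range(10):
    u = v_cycle(u, f, h)

# For f = 1 the exact solution is u(x) = x (1 - x) / 2, so u(0.5) = 0.125.
print(u[n // 2])
```

The number of interior points is taken as 2^k - 1 so that each coarse grid nests inside the finer one; an AMR framework has to generalize the restriction and interpolation steps to refinement boundaries, which this sketch does not attempt.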