23 research outputs found
Recursive Algorithms for Distributed Forests of Octrees
The forest-of-octrees approach to parallel adaptive mesh refinement and
coarsening (AMR) has recently been demonstrated in the context of a number of
large-scale PDE-based applications. Although linear octrees, which store only
leaf octants, have an underlying tree structure by definition, it is not often
exploited in previously published mesh-related algorithms. This is because the
branches are not explicitly stored, and because the topological relationships
in meshes, such as the adjacency between cells, introduce dependencies that do
not respect the octree hierarchy. In this work we combine hierarchical and
topological relationships between octree branches to design efficient recursive
algorithms.
We present three important algorithms with recursive implementations. The
first is a parallel search for leaves matching any of a set of multiple search
criteria. The second is a ghost layer construction algorithm that handles
arbitrarily refined octrees that are not covered by previous algorithms, which
require a 2:1 condition between neighboring leaves. The third is a universal
mesh topology iterator. This iterator visits every cell in a domain partition,
as well as every interface (face, edge and corner) between these cells. The
iterator calculates the local topological information for every interface that
it visits, taking into account the nonconforming interfaces that increase the
complexity of describing the local topology. To demonstrate the utility of the
topology iterator, we use it to compute the numbering and encoding of
higher-order nodal basis functions.
We analyze the complexity of the new recursive algorithms theoretically, and
assess their performance, both in terms of single-processor efficiency and in
terms of parallel scalability, demonstrating good weak and strong scaling up to
458k cores of the JUQUEEN supercomputer.Comment: 35 pages, 15 figures, 3 table
Landau Collision Integral Solver with Adaptive Mesh Refinement on Emerging Architectures
The Landau collision integral is an accurate model for the small-angle
dominated Coulomb collisions in fusion plasmas. We investigate a high order
accurate, fully conservative, finite element discretization of the nonlinear
multi-species Landau integral with adaptive mesh refinement using the PETSc
library (www.mcs.anl.gov/petsc). We develop algorithms and techniques to
efficiently utilize emerging architectures with an approach that minimizes
memory usage and movement and is suitable for vector processing. The Landau
collision integral is vectorized with Intel AVX-512 intrinsics and the solver
sustains as much as 22% of the theoretical peak flop rate of the Second
Generation Intel Xeon Phi, Knights Landing, processor
An extreme-scale implicit solver for complex PDEs: highly heterogeneous flow in earth's mantle
Mantle convection is the fundamental physical process within earth's interior responsible for the thermal and geological evolution of the planet, including plate tectonics. The mantle is modeled as a viscous, incompressible, non-Newtonian fluid. The wide range of spatial scales, extreme variability and anisotropy in material properties, and severely nonlinear rheology have made global mantle convection modeling with realistic parameters prohibitive. Here we present a new implicit solver that exhibits optimal algorithmic performance and is capable of extreme scaling for hard PDE problems, such as mantle convection. To maximize accuracy and minimize runtime, the solver incorporates a number of advances, including aggressive multi-octree adaptivity, mixed continuous-discontinuous discretization, arbitrarily-high-order accuracy, hybrid spectral/geometric/algebraic multigrid, and novel Schur-complement preconditioning. These features present enormous challenges for extreme scalability. We demonstrate that---contrary to conventional wisdom---algorithmically optimal implicit solvers can be designed that scale out to 1.5 million cores for severely nonlinear, ill-conditioned, heterogeneous, and anisotropic PDEs
A scalable parallel finite element framework for growing geometries. Application to metal additive manufacturing
This work introduces an innovative parallel, fully-distributed finite element
framework for growing geometries and its application to metal additive
manufacturing. It is well-known that virtual part design and qualification in
additive manufacturing requires highly-accurate multiscale and multiphysics
analyses. Only high performance computing tools are able to handle such
complexity in time frames compatible with time-to-market. However, efficiency,
without loss of accuracy, has rarely held the centre stage in the numerical
community. Here, in contrast, the framework is designed to adequately exploit
the resources of high-end distributed-memory machines. It is grounded on three
building blocks: (1) Hierarchical adaptive mesh refinement with octree-based
meshes; (2) a parallel strategy to model the growth of the geometry; (3)
state-of-the-art parallel iterative linear solvers. Computational experiments
consider the heat transfer analysis at the part scale of the printing process
by powder-bed technologies. After verification against a 3D benchmark, a
strong-scaling analysis assesses performance and identifies major sources of
parallel overhead. A third numerical example examines the efficiency and
robustness of (2) in a curved 3D shape. Unprecedented parallelism and
scalability were achieved in this work. Hence, this framework contributes to
take on higher complexity and/or accuracy, not only of part-scale simulations
of metal or polymer additive manufacturing, but also in welding, sedimentation,
atherosclerosis, or any other physical problem where the physical domain of
interest grows in time
Scalable Recovery-based Adaptation on Quadtree Meshes for Advection-Diffusion-Reaction Problems
We propose a mesh adaptation procedure for Cartesian quadtree meshes, to
discretize scalar advection-diffusion-reaction problems. The adaptation process
is driven by a recovery-based a posteriori estimator for the -norm
of the discretization error, based on suitable higher order approximations of
both the solution and the associated gradient. In particular, a metric-based
approach exploits the information furnished by the estimator to iteratively
predict the new adapted mesh. The new mesh adaptation algorithm is successfully
assessed on different configurations, and turns out to perform well also when
dealing with discontinuities in the data as well as in the presence of internal
layers not aligned with the Cartesian directions. A cross-comparison with a
standard estimate--mark--refine approach and with other adaptive strategies
available in the literature shows the remarkable accuracy and parallel
scalability of the proposed approach