4,272 research outputs found
Task-based adaptive multiresolution for time-space multi-scale reaction-diffusion systems on multi-core architectures
A new solver featuring time-space adaptation and error control has been
recently introduced to tackle the numerical solution of stiff
reaction-diffusion systems. Based on operator splitting, finite volume adaptive
multiresolution and high order time integrators with specific stability
properties for each operator, this strategy yields high computational
efficiency for large multidimensional computations on standard architectures
such as powerful workstations. However, the data structure of the original
implementation, based on trees of pointers, provides limited opportunities for
efficiency enhancements, while posing serious challenges in terms of parallel
programming and load balancing. The present contribution proposes a new
implementation of the whole set of numerical methods including Radau5 and
ROCK4, relying on a fully different data structure together with the use of a
specific library, TBB, for shared-memory, task-based parallelism with
work-stealing. The performance of our implementation is assessed in a series of
test-cases of increasing difficulty in two and three dimensions on multi-core
and many-core architectures, demonstrating high scalability
Construction and Application of an AMR Algorithm for Distributed Memory Computers
While the parallelization of blockstructured adaptive mesh refinement techniques is relatively straight-forward on shared memory architectures, appropriate distribution strategies for the emerging generation of distributed
memory machines are a topic of on-going research. In this paper, a locality-preserving domain decomposition is proposed that partitions the entire AMR hierarchy from the base level on. It is shown that the approach reduces the
communication costs and simplifies the implementation. Emphasis is put on the effective parallelization of the flux correction procedure at coarse-fine boundaries, which is indispensable for conservative finite volume schemes. An
easily reproducible standard benchmark and a highly resolved parallel AMR
simulation of a diffracting hydrogen-oxygen detonation demonstrate the proposed
strategy in practice
Adaptive Mesh Refinement for Singular Current Sheets in Incompressible Magnetohydrodynamic Flows
The formation of current sheets in ideal incompressible magnetohydrodynamic
flows in two dimensions is studied numerically using the technique of adaptive
mesh refinement. The growth of current density is in agreement with simple
scaling assumptions. As expected, adaptive mesh refinement shows to be very
efficient for studying singular structures compared to non-adaptive treatments.Comment: 8 pages RevTeX, 13 Postscript figure
A Parallel Mesh-Adaptive Framework for Hyperbolic Conservation Laws
We report on the development of a computational framework for the parallel,
mesh-adaptive solution of systems of hyperbolic conservation laws like the
time-dependent Euler equations in compressible gas dynamics or
Magneto-Hydrodynamics (MHD) and similar models in plasma physics. Local mesh
refinement is realized by the recursive bisection of grid blocks along each
spatial dimension, implemented numerical schemes include standard
finite-differences as well as shock-capturing central schemes, both in
connection with Runge-Kutta type integrators. Parallel execution is achieved
through a configurable hybrid of POSIX-multi-threading and MPI-distribution
with dynamic load balancing. One- two- and three-dimensional test computations
for the Euler equations have been carried out and show good parallel scaling
behavior. The Racoon framework is currently used to study the formation of
singularities in plasmas and fluids.Comment: late submissio
Recursive Algorithms for Distributed Forests of Octrees
The forest-of-octrees approach to parallel adaptive mesh refinement and
coarsening (AMR) has recently been demonstrated in the context of a number of
large-scale PDE-based applications. Although linear octrees, which store only
leaf octants, have an underlying tree structure by definition, it is not often
exploited in previously published mesh-related algorithms. This is because the
branches are not explicitly stored, and because the topological relationships
in meshes, such as the adjacency between cells, introduce dependencies that do
not respect the octree hierarchy. In this work we combine hierarchical and
topological relationships between octree branches to design efficient recursive
algorithms.
We present three important algorithms with recursive implementations. The
first is a parallel search for leaves matching any of a set of multiple search
criteria. The second is a ghost layer construction algorithm that handles
arbitrarily refined octrees that are not covered by previous algorithms, which
require a 2:1 condition between neighboring leaves. The third is a universal
mesh topology iterator. This iterator visits every cell in a domain partition,
as well as every interface (face, edge and corner) between these cells. The
iterator calculates the local topological information for every interface that
it visits, taking into account the nonconforming interfaces that increase the
complexity of describing the local topology. To demonstrate the utility of the
topology iterator, we use it to compute the numbering and encoding of
higher-order nodal basis functions.
We analyze the complexity of the new recursive algorithms theoretically, and
assess their performance, both in terms of single-processor efficiency and in
terms of parallel scalability, demonstrating good weak and strong scaling up to
458k cores of the JUQUEEN supercomputer.Comment: 35 pages, 15 figures, 3 table
A scalable parallel finite element framework for growing geometries. Application to metal additive manufacturing
This work introduces an innovative parallel, fully-distributed finite element
framework for growing geometries and its application to metal additive
manufacturing. It is well-known that virtual part design and qualification in
additive manufacturing requires highly-accurate multiscale and multiphysics
analyses. Only high performance computing tools are able to handle such
complexity in time frames compatible with time-to-market. However, efficiency,
without loss of accuracy, has rarely held the centre stage in the numerical
community. Here, in contrast, the framework is designed to adequately exploit
the resources of high-end distributed-memory machines. It is grounded on three
building blocks: (1) Hierarchical adaptive mesh refinement with octree-based
meshes; (2) a parallel strategy to model the growth of the geometry; (3)
state-of-the-art parallel iterative linear solvers. Computational experiments
consider the heat transfer analysis at the part scale of the printing process
by powder-bed technologies. After verification against a 3D benchmark, a
strong-scaling analysis assesses performance and identifies major sources of
parallel overhead. A third numerical example examines the efficiency and
robustness of (2) in a curved 3D shape. Unprecedented parallelism and
scalability were achieved in this work. Hence, this framework contributes to
take on higher complexity and/or accuracy, not only of part-scale simulations
of metal or polymer additive manufacturing, but also in welding, sedimentation,
atherosclerosis, or any other physical problem where the physical domain of
interest grows in time
Recent Advances in Graph Partitioning
We survey recent trends in practical algorithms for balanced graph
partitioning together with applications and future research directions
- …