316 research outputs found
A Domain-Decomposed Multilevel Method for Adaptively Refined Cartesian Grids with Embedded Boundaries
Preliminary verification and validation of an efficient Euler solver for adaptively refined Cartesian meshes with embedded boundaries is presented. The parallel, multilevel method makes use of a new on-the-fly parallel domain decomposition strategy based upon the use of space-filling curves, and automatically generates a sequence of coarse meshes for processing by the multigrid smoother. The coarse mesh generation algorithm produces grids which completely cover the computational domain at every level in the mesh hierarchy. A series of examples on realistically complex three-dimensional configurations demonstrate that this new coarsening algorithm reliably achieves mesh coarsening ratios in excess of 7 on adaptively refined meshes. Numerical investigations of the scheme's local truncation error demonstrate an achieved order of accuracy between 1.82 and 1.88. Convergence results for the multigrid scheme are presented for both subsonic and transonic test cases and demonstrate W-cycle multigrid convergence rates between 0.84 and 0.94. Preliminary parallel scalability tests on both simple wing and complex complete aircraft geometries shows a computational speedup of 52 on 64 processors using the run-time mesh partitioner
ColDICE: a parallel Vlasov-Poisson solver using moving adaptive simplicial tessellation
Resolving numerically Vlasov-Poisson equations for initially cold systems can
be reduced to following the evolution of a three-dimensional sheet evolving in
six-dimensional phase-space. We describe a public parallel numerical algorithm
consisting in representing the phase-space sheet with a conforming,
self-adaptive simplicial tessellation of which the vertices follow the
Lagrangian equations of motion. The algorithm is implemented both in six- and
four-dimensional phase-space. Refinement of the tessellation mesh is performed
using the bisection method and a local representation of the phase-space sheet
at second order relying on additional tracers created when needed at runtime.
In order to preserve in the best way the Hamiltonian nature of the system,
refinement is anisotropic and constrained by measurements of local Poincar\'e
invariants. Resolution of Poisson equation is performed using the fast Fourier
method on a regular rectangular grid, similarly to particle in cells codes. To
compute the density projected onto this grid, the intersection of the
tessellation and the grid is calculated using the method of Franklin and
Kankanhalli (1993) generalised to linear order. As preliminary tests of the
code, we study in four dimensional phase-space the evolution of an initially
small patch in a chaotic potential and the cosmological collapse of a
fluctuation composed of two sinusoidal waves. We also perform a "warm" dark
matter simulation in six-dimensional phase-space that we use to check the
parallel scaling of the code.Comment: Code and illustration movies available at:
http://www.vlasix.org/index.php?n=Main.ColDICE - Article submitted to Journal
of Computational Physic
Schnelle Löser für partielle Differentialgleichungen
The workshop Schnelle Löser für partielle Differentialgleichungen, organised by Randolph E. Bank (La Jolla), Wolfgang Hackbusch(Leipzig), Gabriel Wittum (Heidelberg) was held May 22nd - May 28th, 2005. This meeting was well attended by 47 participants with broad geographic representation from 9 countries and 3 continents. This workshop was a nice blend of researchers with various backgrounds
Survey of semi-regular multiresolution models for interactive terrain rendering
Rendering high quality digital terrains at interactive rates requires carefully crafted algorithms and data structures able to balance the competing requirements of realism and frame rates, while taking into account the memory and speed limitations of the underlying graphics platform. In this survey, we analyze multiresolution approaches that exploit a certain semi-regularity of the data. These approaches have produced some of the most efficient systems to date. After providing a short background and motivation for the methods, we focus on illustrating models based on tiled blocks and nested regular grids, quadtrees and triangle bin-trees triangulations, as well as cluster-based approaches. We then discuss LOD error metrics and system-level data management aspects of interactive terrain visualization, including dynamic scene management, out-of-core data organization and compression, as well as numerical accurac
Complex additive geometric multilevel solvers for Helmholtz equations on spacetrees
We introduce a family of implementations of low-order, additive, geometric multilevel solvers for systems of Helmholtz equations arising from Schrödinger equations. Both grid spacing and arithmetics may comprise complex numbers, and we thus can apply complex scaling to the indefinite Helmholtz operator. Our implementations are based on the notion of a spacetree and work exclusively with a finite number of precomputed local element matrices. They are globally matrix-free. Combining various relaxation factors with two grid transfer operators allows us to switch from additive multigrid over a hierarchical basis method into a Bramble-Pasciak-Xu (BPX)-type solver, with several multiscale smoothing variants within one code base. Pipelining allows us to realize full approximation storage (FAS) within the additive environment where, amortized, each grid vertex carrying degrees of freedom is read/written only once per iteration. The codes realize a single-touch policy. Among the features facilitated by matrix-free FAS is arbitrary dynamic mesh refinement (AMR) for all solver variants. AMR as an enabler for full multigrid (FMG) cycling—the grid unfolds throughout the computation—allows us to reduce the cost per unknown. The present work primary contributes toward software realization and design questions. Our experiments show that the consolidation of single-touch FAS, dynamic AMR, and vectorization-friendly, complex scaled, matrix-free FMG cycles delivers a mature implementation blueprint for solvers of Helmholtz equations in general. For this blueprint, we put particular emphasis on a strict implementation formalism as well as some implementation correctness proofs
Asynchronous Stabilisation and Assembly Techniques for Additive Multigrid
Multigrid solvers are among the best solvers in the world, but once
applied in the real world there are issues they must overcome. Many multigrid
phases exhibit low concurrency. Mesh and matrix assembly are challenging to
parallelise and introduce algorithmic latency. Dynamically adaptive codes exacerbate
these issues. Multigrid codes require the computation of a cascade of matrices and
dynamic adaptivity means these matrices are recomputed throughout the solve.
Existing methods to compute the matrices are expensive and delay the solve. Non-
trivial material parameters further increase the cost of accurate equation integration.
We propose to assemble all matrix equations as stencils in a delayed element-wise
fashion. Early multigrid iterations use cheap geometric approximations and more
accurate updated stencil integrations are computed in parallel with the multigrid
cycles. New stencil integrations are evaluated lazily and asynchronously fed to the
solver once they become available. They do not delay multigrid iterations. We
deploy stencil integrations as parallel tasks that are picked up by cores that would
otherwise be idle. Coarse grid solves in multiplicative multigrid also exhibit limited
concurrency. Small coarse mesh sizes correspond to small computational workload
and require costly synchronisation steps. This acts as a bottleneck and delays
solver iterations. Additive multigrid avoids this restriction, but becomes unstable
for non-trivial material parameters as additive coarse grid levels tend to overcorrect.
This leads to oscillations. We propose a new additive variant, adAFAC-x, with a
stabilisation parameter that damps coarse grid corrections to remove oscillations.
Per-level we solve an additional equation that produces an auxiliary correction.
The auxiliary correction can be computed additively to the rest of the solve and
uses ideas similar to smoothed aggregation multigrid to anticipate overcorrections.
Pipelining techniques allow adAFAC-x to be written using single-touch semantics
on a dynamically adaptive mesh
Doctor of Philosophy
dissertationThe increase in computational power of supercomputers is enabling complex scientific phenomena to be simulated at ever-increasing resolution and fidelity. With these simulations routinely producing large volumes of data, performing efficient I/O at this scale has become a very difficult task. Large-scale parallel writes are challenging due to the complex interdependencies between I/O middleware and hardware. Analytic-appropriate reads are traditionally hindered by bottlenecks in I/O access. Moreover, the two components of I/O, data generation from simulations (writes) and data exploration for analysis and visualization (reads), have substantially different data access requirements. Parallel writes, performed on supercomputers, often deploy aggregation strategies to permit large-sized contiguous access. Analysis and visualization tasks, usually performed on computationally modest resources, require fast access to localized subsets or multiresolution representations of the data. This dissertation tackles the problem of parallel I/O while bridging the gap between large-scale writes and analytics-appropriate reads. The focus of this work is to develop an end-to-end adaptive-resolution data movement framework that provides efficient I/O, while supporting the full spectrum of modern HPC hardware. This is achieved by developing technology for highly scalable and tunable parallel I/O, applicable to both traditional parallel data formats and multiresolution data formats, which are directly appropriate for analysis and visualization. To demonstrate the efficacy of the approach, a novel library (PIDX) is developed that is highly tunable and capable of adaptive-resolution parallel I/O to a multiresolution data format. Adaptive resolution storage and I/O, which allows subsets of a simulation to be accessed at varying spatial resolutions, can yield significant improvements to both the storage performance and I/O time. The library provides a set of parameters that controls the storage format and the nature of data aggregation across he network; further, a machine learning-based model is constructed that tunes these parameters for the maximum throughput. This work is empirically demonstrated by showing parallel I/O scaling up to 768K cores within a framework flexible enough to handle adaptive resolution I/O
- …