190 research outputs found
A Comparison of Two Shallow Water Models with Non-Conforming Adaptive Grids: classical tests
In an effort to study the applicability of adaptive mesh refinement (AMR)
techniques to atmospheric models an interpolation-based spectral element
shallow water model on a cubed-sphere grid is compared to a block-structured
finite volume method in latitude-longitude geometry. Both models utilize a
non-conforming adaptation approach which doubles the resolution at fine-coarse
mesh interfaces. The underlying AMR libraries are quad-tree based and ensure
that neighboring regions can only differ by one refinement level.
The models are compared via selected test cases from a standard test suite
for the shallow water equations. They include the advection of a cosine bell, a
steady-state geostrophic flow, a flow over an idealized mountain and a
Rossby-Haurwitz wave. Both static and dynamic adaptations are evaluated, which
reveal the strengths and weaknesses of the AMR techniques. Overall, the AMR
simulations show that both models successfully place static and dynamic
adaptations in local regions without requiring a fine grid in the global
domain. The adaptive grids reliably track features of interest without visible
distortions or noise at mesh interfaces. Simple threshold adaptation criteria
for the geopotential height and the relative vorticity are assessed.
Comment: 25 pages, 11 figures, preprin
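A threshold criterion combined with the one-level-difference rule described in the abstract can be illustrated with a minimal sketch. It assumes a 1D row of blocks rather than the quad-tree grids of the paper, and the function names, threshold, and sample values are hypothetical, not taken from either model.

```python
def flag_levels(vorticity, threshold, max_level):
    """Target level per block: max_level where |vorticity| >= threshold, else 0.

    The threshold value is an assumption chosen by the caller; the abstract
    does not state the values used in the paper.
    """
    return [max_level if abs(v) >= threshold else 0 for v in vorticity]


def enforce_one_level_jump(levels):
    """Raise levels until neighbouring blocks differ by at most one level.

    Works on a 1D, non-periodic row of blocks (a simplification of the
    quad-tree case). Levels are only ever raised, never lowered.
    """
    levels = list(levels)
    changed = True
    while changed:
        changed = False
        for i in range(len(levels) - 1):
            left, right = levels[i], levels[i + 1]
            if left - right > 1:
                levels[i + 1] = left - 1
                changed = True
            elif right - left > 1:
                levels[i] = right - 1
                changed = True
    return levels


if __name__ == "__main__":
    # Hypothetical per-block relative vorticity values.
    vort = [0.1, 0.2, 0.1, 3.0, 0.2, 0.1]
    flagged = flag_levels(vort, threshold=1.0, max_level=3)
    print(enforce_one_level_jump(flagged))
```

The balancing pass only raises levels, so a flagged block keeps its resolution and the surrounding blocks are refined as needed to form a graded transition.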
Evaluation of an efficient stack-RLE clustering concept for dynamically adaptive grids
This is the author accepted manuscript. The final version is available from the Society for Industrial and Applied Mathematics via the DOI in this record.
Abstract.
One approach to tackle the challenge of efficient implementations for parallel PDE simulations
on dynamically changing grids is the usage of space-filling curves (SFC). While SFC algorithms
possess advantageous properties such as low memory requirements and close-to-optimal partitioning
approaches with linear complexity, they require efficient communication strategies for keeping and
utilizing the connectivity information, in particular for dynamically changing grids. Our approach
is to use a sparse communication graph to store the connectivity information and to transfer data
block-wise. This permits efficient generation of multiple partitions per memory context (denoted
by clustering) which - in combination with a run-length encoding (RLE) - directly leads to elegant
solutions for shared, distributed and hybrid parallelization and allows cluster-based optimizations.
While previous work focused on specific aspects, we present in this paper an overall compact
summary of the stack-RLE clustering approach, complemented by aspects of the vertex-based communication
that ease understanding of the approach. The central contribution of this work is the proof
of suitability of the stack-RLE clustering approach for an efficient realization of different, relevant
building blocks of Scientific Computing methodology and real-life CSE applications: We show 95%
strong scalability for small-scale scalability benchmarks on 512 cores and weak scalability of over 90%
on 8192 cores for finite-volume solvers and changing grid structure in every time step; optimizations
of simulation data backends by writer tasks; comparisons of analytical benchmarks to analyze the
adaptivity criteria; and a tsunami simulation as a representative real-world showcase of wave propagation
for our approach which reduces the overall workload by 95% for parallel fully-adaptive mesh
refinement and, based on a comparison with SFC-ordered regular grid cells, reduces the computation
time by a factor of 7.6 with improved results and a factor of 62.2 with results of similar accuracy of
buoy station data.
This work was partly supported by the German Research
Foundation (DFG) as part of the Transregional Collaborative Research Centre “Invasive
Computing” (SFB/TR 89).
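The run-length encoding mentioned in the abstract can be illustrated generically. The sketch below assumes that the data being compressed is a list of neighbouring-partition ids, one per shared boundary element in traversal order; that layout and the function names are assumptions made for illustration and are not the paper's actual data structures.

```python
from itertools import groupby


def rle_encode(neighbour_ids):
    """Compress consecutive repeats into (partition_id, count) pairs."""
    return [(pid, sum(1 for _ in run)) for pid, run in groupby(neighbour_ids)]


def rle_decode(pairs):
    """Expand (partition_id, count) pairs back into the per-element list."""
    return [pid for pid, count in pairs for _ in range(count)]


if __name__ == "__main__":
    # Hypothetical example: six boundary elements shared with partitions 3, 7 and 2.
    ids = [3, 3, 3, 7, 7, 2]
    encoded = rle_encode(ids)
    print(encoded)
    print(rle_decode(encoded) == ids)
```

When long runs of boundary elements share the same neighbour, the encoded form grows with the number of neighbouring partitions rather than with the number of boundary elements.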
Constructing Reference Metrics on Multicube Representations of Arbitrary Manifolds
Reference metrics are used to define the differential structure on multicube
representations of manifolds, i.e., they provide a simple and practical way to
define what it means globally for tensor fields and their derivatives to be
continuous. This paper introduces a general procedure for constructing
reference metrics automatically on multicube representations of manifolds with
arbitrary topologies. The method is tested here by constructing reference
metrics for compact, orientable two-dimensional manifolds with genera between
zero and five. These metrics are shown to satisfy the Gauss-Bonnet identity
numerically to the level of truncation error (which converges toward zero as
the numerical resolution is increased). These reference metrics can be made
smoother and more uniform by evolving them with Ricci flow. This smoothing
procedure is tested on the two-dimensional reference metrics constructed here.
These smoothing evolutions (using volume-normalized Ricci flow with DeTurck
gauge fixing) are all shown to produce reference metrics with constant scalar
curvatures (at the level of numerical truncation error).
Comment: 37 pages, 16 figures; additional introductory material added in
version accepted for publicatio
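For reference, the Gauss-Bonnet identity for a compact, orientable two-dimensional manifold $M$ of genus $g$ can be written as below. The abstract does not state which form the paper checks numerically, so the notation here ($K$ for Gaussian curvature, $R = 2K$ for scalar curvature, $\chi$ for the Euler characteristic) is an assumption.

```latex
\int_M K \, dA \;=\; 2\pi\,\chi(M) \;=\; 4\pi\,(1 - g),
\qquad\text{equivalently}\qquad
\int_M R \, dA \;=\; 8\pi\,(1 - g).
```

For the genera $g = 0, \dots, 5$ mentioned above, the right-hand side $4\pi(1-g)$ takes the values $4\pi,\ 0,\ -4\pi,\ -8\pi,\ -12\pi,\ -16\pi$.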
PERFORMANCE EVALUATION AND OPTIMIZATION OF THE UNSTRUCTURED CFD CODE UNCLE
Numerous advances in the computational sciences have made CFD a viable approach to modern-day fluid dynamics problems. Progress in computer performance allows us to solve a complex flow field in practical CPU time. Commodity clusters are also gaining popularity as a computational research platform for various CFD communities. This research focuses on evaluating and enhancing the performance of an in-house, unstructured, 3D CFD code on modern commodity clusters. The fundamental idea is to tune the code to optimize the cache behavior of the nodes of commodity clusters and thereby achieve enhanced code performance. Accordingly, this work presents a discussion of various available techniques for data access optimization and a detailed description of those which yielded improved code performance. These techniques were tested on various steady, unsteady, laminar, and turbulent test cases and the results are presented. The critical hardware parameters which influenced the code performance were identified. A detailed study investigating the effect of these parameters on the code performance was conducted and the results are presented. The successful single-node improvements were also efficiently tested on a parallel platform. The modified version of the code was also ported to different hardware architectures with successful results. Loop blocking is established as a predictor of code performance.
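Loop blocking (also called tiling) restructures a loop nest so that it works on small sub-blocks that fit in cache before moving on. The sketch below shows only the loop structure, using a matrix transpose as a stand-in; the CFD code's actual loops are not described in the abstract, the function name and block size are hypothetical, and pure Python will not show the cache benefit that a compiled language would.

```python
def transpose_blocked(a, block=32):
    """Transpose a rectangular list-of-lists matrix tile by tile.

    `block` is the tile edge length; in a compiled code it would be tuned
    to the cache size, which is an assumption outside this sketch.
    """
    n, m = len(a), len(a[0])
    out = [[0] * n for _ in range(m)]
    for ii in range(0, n, block):          # loop over tile rows
        for jj in range(0, m, block):      # loop over tile columns
            for i in range(ii, min(ii + block, n)):
                row = a[i]
                for j in range(jj, min(jj + block, m)):
                    out[j][i] = row[j]
    return out


if __name__ == "__main__":
    print(transpose_blocked([[1, 2, 3], [4, 5, 6]], block=2))
```

The two outer loops choose a tile and the two inner loops stay inside it, so reads and writes touch a bounded region of memory at a time.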