AMR on the CM-2
We describe the development of a structured adaptive mesh refinement (AMR) algorithm for the Connection Machine-2 (CM-2). We develop a data layout scheme that preserves locality even for communication between fine and coarse grids. On 8K processors of a 32K-processor machine we achieve performance slightly less than that of one CPU of the Cray Y-MP. We apply our algorithm to an inviscid compressible flow problem.
Performance and Optimization Abstractions for Large Scale Heterogeneous Systems in the Cactus/Chemora Framework
We describe a set of lower-level abstractions to improve performance on
modern large scale heterogeneous systems. These provide portable access to
system- and hardware-dependent features, automatically apply dynamic
optimizations at run time, and target stencil-based codes used in finite
differencing, finite volume, or block-structured adaptive mesh refinement
codes.
These abstractions include a novel data structure to manage refinement
information for block-structured adaptive mesh refinement, an iterator
mechanism to efficiently traverse multi-dimensional arrays in stencil-based
codes, and a portable API and implementation for explicit SIMD vectorization.
These abstractions can be employed manually, targeted by automated code
generation, or used by compilers via support libraries during code
generation. The implementations described below are available in the
Cactus framework and are used, e.g., in the Einstein Toolkit for relativistic
astrophysics simulations.
Three-dimensional adaptive evolution of gravitational waves in numerical relativity
Adaptive techniques are crucial for successful numerical modeling of
gravitational waves from astrophysical sources such as coalescing compact
binaries, since the radiation typically has wavelengths much larger than the
scale of the sources. We have carried out an important step toward this goal,
the evolution of weak gravitational waves using adaptive mesh refinement in the
Einstein equations. The 2-level adaptive simulation is compared with unigrid
runs at coarse and fine resolution, and is shown to track closely the features
of the fine grid run.
Comment: REVTeX, 7 pages, including three figures; submitted to Physical
Review
Relativistic MHD with Adaptive Mesh Refinement
This paper presents a new computer code to solve the general relativistic
magnetohydrodynamics (GRMHD) equations using distributed parallel adaptive mesh
refinement (AMR). The fluid equations are solved using a finite difference
Convex ENO method (CENO) in 3+1 dimensions, and the AMR is Berger-Oliger.
Hyperbolic divergence cleaning is used to control the $\nabla \cdot \mathbf{B} = 0$
constraint. We present results from three flat space tests, and examine the
accretion of a fluid onto a Schwarzschild black hole, reproducing the Michel
solution. The AMR simulations substantially improve performance while
reproducing the results of the resolution-equivalent unigrid simulations. Finally, we
discuss strong scaling results for parallel unigrid and AMR runs.
Comment: 24 pages, 14 figures, 3 tables
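The Berger-Oliger scheme named in the abstract above advances each finer level with proportionally smaller time steps. The following is a minimal sketch of the resulting update order only, not the paper's code: the function name `berger_oliger_schedule`, the refinement ratio of 2, and the `(level, substep)` bookkeeping are assumptions made for illustration.

```python
def berger_oliger_schedule(num_levels, ratio=2):
    """Return the order of (level, substep) updates for one coarse time step.

    Hypothetical helper: level 0 is the coarsest grid, and each finer level
    takes `ratio` substeps for every step of its parent level.
    """
    schedule = []
    counters = [0] * num_levels  # substeps taken so far on each level

    def advance(level):
        # Advance this level by one of its own time steps first, so finer
        # levels can interpolate their boundary data in time from it.
        schedule.append((level, counters[level]))
        counters[level] += 1
        if level + 1 < num_levels:
            for _ in range(ratio):
                advance(level + 1)
            # A real code would now restrict (inject) the fine solution
            # back onto this level where the grids overlap.

    advance(0)
    return schedule


if __name__ == "__main__":
    for level, substep in berger_oliger_schedule(3):
        print("  " * level + f"level {level}, substep {substep}")
```

With three levels the coarse grid is updated once, the middle level twice, and the finest level four times per coarse step, which is where the saving over a uniformly fine grid comes from.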
Model Order Reduction for Rotating Electrical Machines
The simulation of electric rotating machines is both computationally
expensive and memory intensive. To overcome these costs, model order reduction
techniques can be applied. The focus of this contribution is especially on
machines that contain non-symmetric components. These are usually introduced
during the mass production process and are modeled by small perturbations in
the geometry (e.g., eccentricity) or the material parameters. While model order
reduction for symmetric machines is clear and does not need special treatment,
the non-symmetric setting poses additional challenges. An adaptive strategy
based on proper orthogonal decomposition is developed to overcome these
difficulties. An a posteriori error estimator certifies the obtained
solution. Numerical examples are presented to demonstrate the
effectiveness of the proposed method.
Adaptive computational methods for aerothermal heating analysis
The development of adaptive gridding techniques for finite-element analysis of fluid dynamics equations is described. The developmental work was done with the Euler equations, concentrating on capturing shocks and inviscid flow fields. Ultimately this methodology is to be applied to a viscous analysis for the purpose of accurately predicting aerothermal loads on complex shapes subjected to high-speed flow environments. The development of local error estimate strategies as a basis for refinement strategies is discussed, as well as the refinement strategies themselves. The application of the strategies to triangular elements and a finite-element flux-corrected-transport numerical scheme is presented. The implementation of these strategies in the GIM/PAGE code for 2-D and 3-D applications is documented and demonstrated.
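The idea of driving refinement from a local error estimate can be illustrated in one dimension. This is a generic sketch, not the estimator used in GIM/PAGE: the function name `flag_cells`, the undivided second difference as the indicator, and the `fraction` threshold are all assumptions chosen for illustration.

```python
def flag_cells(values, fraction=0.5):
    """Flag cells for refinement using a simple local error indicator.

    Assumed indicator: the undivided second difference of the solution,
    which is large near shocks and zero where the solution is linear.
    A cell is flagged when its indicator exceeds `fraction` of the maximum.
    """
    n = len(values)
    indicator = [0.0] * n  # boundary cells keep a zero indicator
    for i in range(1, n - 1):
        indicator[i] = abs(values[i + 1] - 2.0 * values[i] + values[i - 1])
    peak = max(indicator)
    if peak == 0.0:
        return [False] * n  # nothing to refine
    return [e > fraction * peak for e in indicator]


if __name__ == "__main__":
    # A step profile standing in for a captured shock.
    density = [1.0, 1.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0]
    print(flag_cells(density))
```

Only the two cells adjacent to the jump are flagged, so refinement concentrates where the solution changes abruptly.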
Parallel Graph Partitioning for Complex Networks
Processing large complex networks like social networks or web graphs has
recently attracted considerable interest. In order to do this in parallel, we
need to partition them into pieces of about equal size. Unfortunately, previous
parallel graph partitioners originally developed for more regular mesh-like
networks do not work well for these networks. This paper addresses this problem
by parallelizing and adapting the label propagation technique originally
developed for graph clustering. By introducing size constraints, label
propagation becomes applicable for both the coarsening and the refinement phase
of multilevel graph partitioning. We obtain very high quality by applying a
highly parallel evolutionary algorithm to the coarsened graph. The resulting
system is both more scalable and achieves higher quality than state-of-the-art
systems like ParMetis or PT-Scotch. For large complex networks the performance
differences are very large. For example, our algorithm can partition a web graph
with 3.3 billion edges in less than sixteen seconds using 512 cores of a high
performance cluster while producing a high quality partition -- none of the
competing systems can handle this graph on our system.
Comment: Review article. Parallelization of our previous approach
arXiv:1402.328
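Label propagation with a size constraint, as described in the abstract above, can be sketched sequentially as follows. This is an illustrative simplification rather than the authors' parallel implementation: the function name, unit node weights, the fixed visiting order, and the deterministic tie-breaking are assumptions.

```python
from collections import Counter


def size_constrained_label_propagation(adjacency, max_size, rounds=3):
    """Cluster a graph by label propagation with an upper bound on cluster size.

    `adjacency` maps each node to a list of its neighbours (undirected graph,
    unit node weights assumed). Every node starts in its own cluster and then
    moves to the neighbouring cluster it shares the most edges with, provided
    that cluster would not grow beyond `max_size`.
    """
    labels = {v: v for v in adjacency}
    sizes = Counter(labels.values())
    for _ in range(rounds):
        moved = False
        for v in sorted(adjacency):  # fixed order; often randomized in practice
            current = labels[v]
            # Count edges from v into each neighbouring cluster.
            links = Counter(labels[u] for u in adjacency[v])
            best, best_links = current, links.get(current, 0)
            for label in sorted(links):
                if label == current:
                    continue
                fits = sizes[label] + 1 <= max_size
                if fits and links[label] > best_links:
                    best, best_links = label, links[label]
            if best != current:
                sizes[current] -= 1
                sizes[best] += 1
                labels[v] = best
                moved = True
        if not moved:
            break  # converged early
    return labels


if __name__ == "__main__":
    # Two triangles joined by the single edge 2-3.
    graph = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3],
             3: [2, 4, 5], 4: [3, 5], 5: [3, 4]}
    print(size_constrained_label_propagation(graph, max_size=3))
```

In a multilevel partitioner the resulting clusters would be contracted to form the coarser graph; the same move rule, with the bound set to the allowed block size, can serve as a local search during refinement.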