23 research outputs found
A Novel Coarsening Method for Scalable and Efficient Mesh Generation
In this paper, we propose a novel mesh coarsening method called the brick coarsening method. The proposed method can be used in conjunction with any graph partitioner and scales to very large meshes. This method reduces the problem space by decomposing the original mesh into fixed-size blocks of nodes called bricks, layered in a similar way to conventional brick laying, and then assigning each node of the original mesh to the appropriate brick. Our experiments indicate that the proposed method scales to very large meshes while allowing a simple RCB partitioner to produce higher-quality partitions with significantly fewer edge cuts. Our results further indicate that the proposed brick-coarsening method allows more complicated partitioners like PT-Scotch to scale to very large problem sizes while still maintaining good partitioning performance with a relatively good edge-cut metric. Graph partitioning is an important problem that has many scientific and engineering applications in such areas as VLSI design, scientific computing, and resource management. Given a graph G = (V,E), where V is the set of vertices and E is the set of edges, the (k-way) graph partitioning problem is to partition the vertices of the graph (V) into k disjoint groups such that each group contains a roughly equal number of vertices and the number of edges connecting vertices in different groups is minimized. Graph partitioning plays a key role in large scientific computing, especially in mesh-based computations, as it is used as a tool to minimize the volume of communication and to ensure well-balanced load across computing nodes. The impact of graph partitioning on the reduction of communication can be easily seen, for example, in different iterative methods to solve a sparse system of linear equations.
Here, a graph partitioning technique is applied to the matrix, which is basically a graph in which each edge is a non-zero entry in the matrix, to allocate groups of vertices to processors in such a way that much of the matrix-vector multiplication can be performed locally on each processor, and hence to minimize communication. Furthermore, a good graph partitioning scheme ensures that an equal amount of computation is performed on each processor. Graph partitioning is a well-known NP-complete problem, and thus the most commonly used graph partitioning algorithms employ some form of heuristics. These algorithms vary in terms of their complexity, partition generation time, and the quality of partitions, and they tend to trade off these factors. A significant challenge we are currently facing at the Lawrence Livermore National Laboratory is how to partition very large meshes on massive distributed-memory machines like IBM BlueGene/P, where scalability becomes a big issue. For example, we have found that ParMetis, a very popular graph partitioning tool, can only scale to 16K processors. An ideal graph partitioning method in such an environment should be fast and scale to very large meshes, while producing high-quality partitions. This is an extremely challenging task: to scale to that level, the partitioning algorithm should be simple and still be able to produce partitions that minimize inter-processor communication and balance the load imposed on the processors. Our goals in this work are two-fold: (1) to develop a new scalable graph partitioning method with good load balancing and communication reduction capability; and (2) to study the performance of the proposed partitioning method on very large parallel machines using actual data sets and compare the performance to that of existing methods. The proposed method achieves the desired scalability by reducing the mesh size.
For this, it coarsens an input mesh into a smaller mesh by coalescing the vertices and edges of the original mesh into a set of mega-vertices and mega-edges. A new coarsening method called the brick algorithm is developed in this research. In the brick algorithm, the zones in a given mesh are first grouped into fixed-size blocks called bricks. These bricks are then laid in a way similar to the conventional brick-laying technique, which reduces the number of neighboring blocks each block needs to communicate with. Contributions of this research are as follows: (1) We have developed a novel method that scales to a really large problem size while producing high-quality mesh partitions; (2) We measured the performance and scalability of the proposed method on a machine of massive size using a set of actual large complex data sets, where we have scaled to a mesh with 110 million zones using our method. To the best of our knowledge, this is the largest complex mesh that a partitioning method has been successfully applied to; and (3) We have shown that the proposed method can reduce the number of edge cuts by as much as 65%.
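The coarsening step described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a 2D structured grid of zones with 4-neighbour connectivity, and it assumes that "brick laying" means shifting every other row of bricks by half a brick width. The function names (`grid_edges`, `brick_of`, `coarsen_to_bricks`, `edge_cut`) and the parameters `bw` and `bh` (brick width and height in zones) are hypothetical. The sketch builds the coarse graph of mega-vertices and mega-edges and evaluates the edge-cut metric for a given partition.

```python
from collections import defaultdict


def grid_edges(nx, ny):
    """Yield the edges of an nx-by-ny grid of zones (4-neighbour connectivity)."""
    for j in range(ny):
        for i in range(nx):
            if i + 1 < nx:
                yield (i, j), (i + 1, j)
            if j + 1 < ny:
                yield (i, j), (i, j + 1)


def brick_of(i, j, bw, bh):
    """Map zone (i, j) to a brick id.

    Assumed layout: odd brick rows are shifted by half a brick width,
    as in a running-bond brick wall.
    """
    row = j // bh
    shift = bw // 2 if row % 2 else 0
    return ((i + shift) // bw, row)


def coarsen_to_bricks(nx, ny, bw, bh):
    """Return (vertex_weights, edge_weights) of the coarse graph of bricks.

    A mega-vertex weight is the number of zones in the brick; a mega-edge
    weight is the number of fine edges that cross between two bricks.
    """
    vweight = defaultdict(int)
    eweight = defaultdict(int)
    for j in range(ny):
        for i in range(nx):
            vweight[brick_of(i, j, bw, bh)] += 1
    for a, b in grid_edges(nx, ny):
        ba, bb = brick_of(*a, bw, bh), brick_of(*b, bw, bh)
        if ba != bb:
            eweight[(min(ba, bb), max(ba, bb))] += 1
    return dict(vweight), dict(eweight)


def edge_cut(edges, part):
    """Count edges whose endpoints are assigned to different partitions."""
    return sum(1 for a, b in edges if part[a] != part[b])
```

Any partitioner could then be run on the much smaller coarse graph, and each zone inherits the partition of its brick. The vertex weights let the partitioner balance load even though bricks at the mesh boundary hold fewer zones.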
Recommended from our members
Parallel Clustering Algorithms for Structured AMR
We compare several different parallel implementation approaches for the clustering operations performed during adaptive gridding operations in patch-based structured adaptive mesh refinement (SAMR) applications. Specifically, we target the clustering algorithm of Berger and Rigoutsos (BR91), which is commonly used in many SAMR applications. The baseline for comparison is a simplistic parallel extension of the original algorithm that works well for up to O(10^2) processors. Our goal is a clustering algorithm for machines of up to O(10^5) processors, such as the 64K-processor IBM BlueGene/Light system. We first present an algorithm that avoids the unneeded communications of the simplistic approach to improve the clustering speed by up to an order of magnitude. We then present a new task-parallel implementation to further reduce communication wait time, adding another order of magnitude of improvement. The new algorithms also exhibit more favorable scaling behavior for our test problems. Performance is evaluated on a number of large scale parallel computer systems, including a 16K-processor BlueGene/Light system.
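For orientation, the serial clustering idea that this abstract builds on can be sketched as follows. This is a simplified, single-process illustration in the spirit of Berger and Rigoutsos, not the parallel algorithms the paper presents. It assumes a 2D set of tagged cells given as `(i, j)` tuples. The names `signature`, `find_hole`, `find_inflection` and `cluster`, and the efficiency threshold `min_eff`, are hypothetical. A box is accepted once the fraction of tagged cells inside it reaches `min_eff`. Otherwise it is split at a hole in the signature, then at the strongest sign change of the signature's second difference, and as a last resort by bisection.

```python
def signature(tags, axis, lo, hi):
    """Count the tagged cells in each slab perpendicular to `axis`."""
    sig = [0] * (hi - lo + 1)
    for t in tags:
        sig[t[axis] - lo] += 1
    return sig


def find_hole(sig):
    """Return the first index with an empty slab, or None."""
    for k, s in enumerate(sig):
        if s == 0:
            return k
    return None


def find_inflection(sig):
    """Return (jump, k): the strongest sign change of the second difference.

    The cut lies between slab k - 1 and slab k; (0, None) if there is none.
    """
    lap = [sig[k - 1] - 2 * sig[k] + sig[k + 1] for k in range(1, len(sig) - 1)]
    best, best_k = 0, None
    for m in range(len(lap) - 1):
        if lap[m] * lap[m + 1] < 0:
            jump = abs(lap[m] - lap[m + 1])
            if jump > best:
                best, best_k = jump, m + 2  # lap[m] belongs to slab m + 1
    return best, best_k


def cluster(tags, min_eff=0.8):
    """Cover the tagged cells with boxes (ilo, jlo, ihi, jhi), bounds inclusive."""
    tags = set(tags)
    if not tags:
        return []
    lo = [min(t[a] for t in tags) for a in (0, 1)]
    hi = [max(t[a] for t in tags) for a in (0, 1)]
    area = (hi[0] - lo[0] + 1) * (hi[1] - lo[1] + 1)
    if len(tags) >= min_eff * area:
        return [(lo[0], lo[1], hi[0], hi[1])]
    sigs = [signature(tags, a, lo[a], hi[a]) for a in (0, 1)]
    axis, cut = None, None
    for a in (0, 1):  # 1) prefer a hole in either signature
        k = find_hole(sigs[a])
        if k is not None:
            axis, cut = a, k
            break
    if axis is None:  # 2) otherwise the strongest inflection point
        best_jump = 0
        for a in (0, 1):
            jump, k = find_inflection(sigs[a])
            if jump > best_jump:
                best_jump, axis, cut = jump, a, k
    if axis is None:  # 3) fall back to bisecting the longer side
        axis = 0 if len(sigs[0]) >= len(sigs[1]) else 1
        cut = len(sigs[axis]) // 2
    split = lo[axis] + cut
    left = {t for t in tags if t[axis] < split}
    return cluster(left, min_eff) + cluster(tags - left, min_eff)
```

Because the bounding box is tightened before every split, the first and last slabs are always non-empty, so both halves of a split contain tags and the recursion terminates.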
Parallel block structured adaptive mesh refinement on graphics processing units.
Block-structured adaptive mesh refinement is a technique that can be used when solving partial differential equations to reduce the number of zones necessary to achieve the required accuracy in areas of interest. These areas (shock fronts, material interfaces, etc.) are recursively covered with finer mesh patches that are grouped into a hierarchy of refinement levels. Despite the potential for large savings in computational requirements and memory usage without a corresponding reduction in accuracy, AMR adds overhead in managing the mesh hierarchy, adding complex communication and data movement requirements to a simulation. In this paper, we describe the design and implementation of a native GPU-based AMR library, including: the classes used to manage data on a mesh patch, the routines used for transferring data between GPUs on different nodes, and the data-parallel operators developed to coarsen and refine mesh data. We validate the performance and accuracy of our implementation using three test problems and two architectures: an eight-node cluster, and over four thousand nodes of Oak Ridge National Laboratory’s Titan supercomputer. Our GPU-based AMR hydrodynamics code performs up to 4.87x faster than the CPU-based implementation, and has been scaled to over four thousand GPUs using a combination of MPI and CUDA
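The coarsen and refine operators mentioned in this abstract can be illustrated with a small CPU-side sketch. It is not the library's GPU code. It assumes cell-centred data stored as a list of rows, a conservative averaging coarsen operator, and a piecewise-constant refine operator. The names `coarsen_patch` and `refine_patch` and the refinement ratio `r` are hypothetical. Each output zone depends only on a fixed set of input zones, which is what makes such operators naturally data-parallel.

```python
def coarsen_patch(fine, r=2):
    """Average each r-by-r block of fine zones into one coarse zone."""
    ny, nx = len(fine), len(fine[0])
    if ny % r or nx % r:
        raise ValueError("patch size must be a multiple of the refinement ratio")
    return [
        [
            sum(fine[J * r + dj][I * r + di] for dj in range(r) for di in range(r))
            / (r * r)
            for I in range(nx // r)
        ]
        for J in range(ny // r)
    ]


def refine_patch(coarse, r=2):
    """Copy each coarse zone value into the r-by-r fine zones that cover it."""
    ny, nx = len(coarse) * r, len(coarse[0]) * r
    return [[coarse[j // r][i // r] for i in range(nx)] for j in range(ny)]
```

With these two choices, coarsening a refined patch returns the original coarse data, so the total of a cell-averaged quantity is preserved across levels. A production code would typically use a higher-order refine operator.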
Modeling NIF Experimental Designs with Adaptive Mesh Refinement and Lagrangian Hydrodynamics
Incorporation of adaptive mesh refinement (AMR) into Lagrangian hydrodynamics algorithms allows for the creation of a highly powerful simulation tool effective for complex target designs with three-dimensional structure. We are developing an advanced modeling tool that includes AMR and traditional arbitrary Lagrangian-Eulerian (ALE) techniques. Our goal is the accurate prediction of vaporization, disintegration and fragmentation in National Ignition Facility (NIF) experimental target elements. Although our focus is on minimizing the generation of shrapnel in target designs and protecting the optics, the general techniques are applicable to modern advanced targets that include three-dimensional effects such as those associated with capsule fill tubes. Several essential computations in ordinary radiation hydrodynamics need to be redesigned in order to allow for AMR to work well with ALE, including algorithms associated with radiation transport. Additionally, for our goal of predicting fragmentation, we include elastic/plastic flow in our computations. We discuss the integration of these effects into a new ALE-AMR simulation code. We apply this newly developed modeling tool, as well as traditional ALE simulations in two and three dimensions, to NIF early-light target designs.
Interface Reconstruction in Two- and Three-Dimensional Arbitrary Lagrangian-Eulerian Adaptive Mesh Refinement Simulations
Modeling of high power laser and ignition facilities requires new techniques because of the higher energies and higher operational costs. We report on the development and application of a new interface reconstruction algorithm for chamber modeling code that combines ALE (Arbitrary Lagrangian Eulerian) techniques with AMR (Adaptive Mesh Refinement). The code is used for the simulation of complex target elements in the National Ignition Facility (NIF) and other similar facilities. The interface reconstruction scheme is required to adequately describe the debris/shrapnel (including fragments or droplets) resulting from energized materials that could affect optics or diagnostic sensors. Traditional ICF modeling codes that choose to implement ALE + AMR techniques will also benefit from this new scheme. The ALE formulation requires material interfaces (including those of generated particles or droplets) to be tracked. We present the interface reconstruction scheme developed for NIF's ALE-AMR and discuss how it is affected by adaptive mesh refinement and the ALE mesh. Results of the code are shown for NIF and OMEGA target configurations
Hierarchical Material Models for Fragmentation Modeling in NIF-ALE-AMR
Fragmentation is a fundamental process that naturally spans micro to macroscopic scales. Recent advances in algorithms, computer simulations, and hardware enable us to connect the continuum to microstructural regimes in a real simulation through a heterogeneous multiscale mathematical model. We apply this model to the problem of predicting how targets in the NIF chamber dismantle, so that optics and diagnostics can be protected from damage. The mechanics of the initial material fracture depend on the microscopic grain structure. In order to effectively simulate the fragmentation, this process must be modeled at the subgrain level with computationally expensive crystal plasticity models. However, there are not enough computational resources to model the entire NIF target at this microscopic scale. In order to accomplish these calculations, a hierarchical material model (HMM) is being developed. The HMM will allow fine-scale modeling of the initial fragmentation using computationally expensive crystal plasticity, while the elements at the mesoscale can use polycrystal models, and the macroscopic elements use analytical flow stress models. The HMM framework is built upon an adaptive mesh refinement (AMR) capability. We present progress in implementing the HMM in the NIF-ALE-AMR code. Additionally, we present test simulations relevant to NIF targets
Relativistic MHD with Adaptive Mesh Refinement
This paper presents a new computer code to solve the general relativistic
magnetohydrodynamics (GRMHD) equations using distributed parallel adaptive mesh
refinement (AMR). The fluid equations are solved using a finite difference
Convex ENO method (CENO) in 3+1 dimensions, and the AMR is Berger-Oliger.
Hyperbolic divergence cleaning is used to control the ∇·B = 0
constraint. We present results from three flat space tests, and examine the
accretion of a fluid onto a Schwarzschild black hole, reproducing the Michel
solution. The AMR simulations substantially improve performance while
reproducing the resolution equivalent unigrid simulation results. Finally, we
discuss strong scaling results for parallel unigrid and AMR runs. Comment: 24 pages, 14 figures, 3 tables
Experiments for the Validation of Debris and Shrapnel Calculations
The debris and shrapnel generated by laser targets are important factors in the operation of a large laser facility such as NIF, LMJ, and Orion. Past experience has shown that it is possible for such target debris to render diagnostics inoperable and also to penetrate or damage optical protection (debris) shields. We are developing the tools to allow evaluation of target configurations in order to better mitigate the generation and impact of debris, including development of dedicated modeling codes. In order to validate these predictive simulations, we briefly describe a series of experiments aimed at determining the amount of debris and/or shrapnel produced in controlled situations. We use glass and aerogel to capture generated debris/shrapnel. The experimental targets include hohlraums (halfraums) and thin foils in a variety of geometries. Post-shot analysis includes scanning electron microscopy and x-ray tomography. We show the results of some of these experiments and discuss modeling efforts