1,488 research outputs found

    Recursive Algorithms for Distributed Forests of Octrees

    Get PDF
    The forest-of-octrees approach to parallel adaptive mesh refinement and coarsening (AMR) has recently been demonstrated in the context of a number of large-scale PDE-based applications. Although linear octrees, which store only leaf octants, have an underlying tree structure by definition, it is not often exploited in previously published mesh-related algorithms. This is because the branches are not explicitly stored, and because the topological relationships in meshes, such as the adjacency between cells, introduce dependencies that do not respect the octree hierarchy. In this work we combine hierarchical and topological relationships between octree branches to design efficient recursive algorithms. We present three important algorithms with recursive implementations. The first is a parallel search for leaves matching any of a set of multiple search criteria. The second is a ghost layer construction algorithm that handles arbitrarily refined octrees that are not covered by previous algorithms, which require a 2:1 condition between neighboring leaves. The third is a universal mesh topology iterator. This iterator visits every cell in a domain partition, as well as every interface (face, edge and corner) between these cells. The iterator calculates the local topological information for every interface that it visits, taking into account the nonconforming interfaces that increase the complexity of describing the local topology. To demonstrate the utility of the topology iterator, we use it to compute the numbering and encoding of higher-order C0C^0 nodal basis functions. We analyze the complexity of the new recursive algorithms theoretically, and assess their performance, both in terms of single-processor efficiency and in terms of parallel scalability, demonstrating good weak and strong scaling up to 458k cores of the JUQUEEN supercomputer.Comment: 35 pages, 15 figures, 3 table

    A Scalable and Modular Software Architecture for Finite Elements on Hierarchical Hybrid Grids

    Full text link
    In this article, a new generic higher-order finite-element framework for massively parallel simulations is presented. The modular software architecture is carefully designed to exploit the resources of modern and future supercomputers. Combining an unstructured topology with structured grid refinement facilitates high geometric adaptability and matrix-free multigrid implementations with excellent performance. Different abstraction levels and fully distributed data structures additionally ensure high flexibility, extensibility, and scalability. The software concepts support sophisticated load balancing and flexibly combining finite element spaces. Example scenarios with coupled systems of PDEs show the applicability of the concepts to performing geophysical simulations.Comment: Preprint of an article submitted to International Journal of Parallel, Emergent and Distributed Systems (Taylor & Francis

    Vectorizable algorithms for adaptive schemes for rapid analysis of SSME flows

    Get PDF
    An initial study into vectorizable algorithms for use in adaptive schemes for various types of boundary value problems is described. The focus is on two key aspects of adaptive computational methods which are crucial in the use of such methods (for complex flow simulations such as those in the Space Shuttle Main Engine): the adaptive scheme itself and the applicability of element-by-element matrix computations in a vectorizable format for rapid calculations in adaptive mesh procedures

    Refficientlib: an efficient load-rebalanced adaptive mesh refinement algorithm for high-performance computational physics meshes

    Get PDF
    No separate or additional fees are collected for access to or distribution of the work.In this paper we present a novel algorithm for adaptive mesh refinement in computational physics meshes in a distributed memory parallel setting. The proposed method is developed for nodally based parallel domain partitions where the nodes of the mesh belong to a single processor, whereas the elements can belong to multiple processors. Some of the main features of the algorithm presented in this paper are its capability of handling multiple types of elements in two and three dimensions (triangular, quadrilateral, tetrahedral, and hexahedral), the small amount of memory required per processor, and the parallel scalability up to thousands of processors. The presented algorithm is also capable of dealing with nonbalanced hierarchical refinement, where multirefinement level jumps are possible between neighbor elements. An algorithm for dealing with load rebalancing is also presented, which allows us to move the hierarchical data structure between processors so that load unbalancing is kept below an acceptable level at all times during the simulation. A particular feature of the proposed algorithm is that arbitrary renumbering algorithms can be used in the load rebalancing step, including both graph partitioning and space-filling renumbering algorithms. The presented algorithm is packed in the Fortran 2003 object oriented library \textttRefficientLib, whose interface calls which allow it to be used from any computational physics code are summarized. Finally, numerical experiments illustrating the performance and scalability of the algorithm are presented.Peer ReviewedPostprint (published version

    TetSplat: Real-time Rendering and Volume Clipping of Large Unstructured Tetrahedral Meshes

    Get PDF
    We present a novel approach to interactive visualization and exploration of large unstructured tetrahedral meshes. These massive 3D meshes are used in mission-critical CFD and structural mechanics simulations, and typically sample multiple field values on several millions of unstructured grid points. Our method relies on the pre-processing of the tetrahedral mesh to partition it into non-convex boundaries and internal fragments that are subsequently encoded into compressed multi-resolution data representations. These compact hierarchical data structures are then adaptively rendered and probed in real-time on a commodity PC. Our point-based rendering algorithm, which is inspired by QSplat, employs a simple but highly efficient splatting technique that guarantees interactive frame-rates regardless of the size of the input mesh and the available rendering hardware. It furthermore allows for real-time probing of the volumetric data-set through constructive solid geometry operations as well as interactive editing of color transfer functions for an arbitrary number of field values. Thus, the presented visualization technique allows end-users for the first time to interactively render and explore very large unstructured tetrahedral meshes on relatively inexpensive hardware

    Refficientlib: an efficient load-rebalanced adaptive mesh refinement algorithm for high-performance computational physics meshes

    Get PDF
    In this paper we present a novel algorithm for adaptive mesh refinement in computational physics meshes in a distributed memory parallel setting. The proposed method is developed for nodally based parallel domain partitions where the nodes of the mesh belong to a single processor, whereas the elements can belong to multiple processors. Some of the main features of the algorithm presented in this paper are its capability of handling multiple types of elements in two and three dimensions (triangular, quadrilateral, tetrahedral, and hexahedral), the small amount of memory required per processor, and the parallel scalability up to thousands of processors. The presented algorithm is also capable of dealing with nonbalanced hierarchical refinement, where multirefinement level jumps are possible between neighbor elements. An algorithm for dealing with load rebalancing is also presented, which allows us to move the hierarchical data structure between processors so that load unbalancing is kept below an acceptable level at all times during the simulation. A particular feature of the proposed algorithm is that arbitrary renumbering algorithms can be used in the load rebalancing step, including both graph partitioning and space-filling renumbering algorithms. The presented algorithm is packed in the Fortran 2003 object oriented library \textttRefficientLib, whose interface calls which allow it to be used from any computational physics code are summarized. Finally, numerical experiments illustrating the performance and scalability of the algorithm are presented. No separate or additional fees are collected for access to or distribution of the wor
    • …
    corecore