51 research outputs found

    Computational complexity and memory usage for multi-frontal direct solvers in structured mesh finite elements

    The multi-frontal direct solver is the state-of-the-art algorithm for the direct solution of sparse linear systems. This paper provides computational complexity and memory usage estimates for the application of the multi-frontal direct solver algorithm to linear systems resulting from B-spline-based isogeometric finite elements, where the mesh is a structured grid. Specifically, we provide the estimates for systems resulting from $C^{p-1}$ polynomial B-spline spaces and compare them to those obtained using $C^0$ spaces. Comment: 8 pages, 2 figures.

    Unified modeling language description of the object-oriented multi-scale adaptive finite element method for step-and-flash imprint lithography simulations

    In the first part of the paper we present the multi-scale simulation of Step-and-Flash Imprint Lithography (SFIL), a modern patterning process. The simulation utilizes the hp adaptive Finite Element Method (hp-FEM) coupled with a Molecular Statics (MS) model. Thus, we consider a multi-scale problem, with molecular statics applied in the areas of the mesh where the highest accuracy is required, and continuous linear elasticity with a thermal expansion coefficient applied in the remaining part of the domain. The degrees of freedom of the macro-scale element nodes located on the macro-scale side of the interface have been identified with particles from nano-scale elements located on the nano-scale side of the interface. In the second part of the paper we present a Unified Modeling Language (UML) description of the resulting multi-scale application (hp-FEM coupled with MS). We investigated classical, procedural codes from the point of view of the object-oriented (O-O) programming paradigm. The discovered hierarchical structure of classes and algorithms makes the UML project as independent of the spatial dimension of the problem as possible. The O-O UML project was defined at an abstract level, independent of the programming language used.

    Performance of a multi-frontal parallel direct solver for hp-finite element method

    In this paper we present the performance of our parallel multi-frontal direct solver when applied to linear systems of equations resulting from discretizations with the hp Finite Element Method (hp-FEM). The hp-FEM generates a sequence of computational meshes delivering exponential convergence of the numerical error with respect to the mesh size (number of degrees of freedom). The sequence of meshes is obtained by performing several hp refinements starting from an arbitrary initial mesh. The solver constructs an initial elimination tree for the arbitrary initial mesh, and expands the elimination tree each time the mesh is refined. The solver has been tested on 3D Direct Current (DC) borehole resistivity measurement simulation problems. We compare the solver with two versions of the MUMPS parallel solver: (1) with distributed entries, executed over the entire problem, and (2) the direct sub-structuring method, with the parallel MUMPS solver utilized to solve the interface problem. We show that by providing the solver with knowledge about the structure of the hp-FEM, the order of elimination is obtained straightforwardly, and that this leads to better performance than submitting the entire matrix to the solver and executing a connectivity-graph-based ordering algorithm.

    Direct solvers performance on h-adapted grids

    We analyse the performance of direct solvers when applied to a system of linear equations arising from an h-adapted, $C^0$ finite element space. Theoretical estimates are derived for typical h-refinement patterns arising as a result of a point, edge, or face singularity, as well as boundary layers. They are based on the elimination trees constructed specifically for the considered grids. Theoretical estimates are compared with experiments performed with MUMPS using the nested-dissection algorithm from the METIS library for construction of the elimination tree. The numerical experiments provide the same performance for the cases where our trees are identical with those constructed by the nested-dissection algorithm, and worse performance for some cases where our trees are different. We also present numerical experiments for the cases with mixed singularities, for which it is unknown how to construct optimal elimination trees. In all analysed cases, the use of h-adaptive grids significantly reduces the cost of the direct solver algorithm per unknown as compared to uniform grids. The theoretical estimates predict, and the experimental data confirm, that the computational complexity is linear for various refinement patterns. In most cases, the cost of the direct solver per unknown is lower when employing anisotropic refinements as opposed to isotropic ones.

    Graph grammar-based multi-frontal parallel direct solver for two-dimensional isogeometric analysis

    This paper introduces a graph-grammar-based model for developing a multi-thread multi-frontal parallel direct solver for the two-dimensional isogeometric finite element method. Execution of the solver algorithm is expressed as a sequence of graph grammar productions. First, productions construct the elimination tree with leaves corresponding to finite elements. The following sequence of graph grammar productions generates element frontal matrices at leaf nodes, merges matrices at parent nodes, and eliminates rows corresponding to fully assembled degrees of freedom. Finally, there are graph grammar productions responsible for the root problem solution and recursive backward substitutions. Expressing the solver algorithm by graph grammar productions allows us to explore the concurrency of the algorithm. The graph grammar productions are grouped into sets of independent tasks that can be executed concurrently. The resulting concurrent multi-frontal solver algorithm is implemented and tested on an NVIDIA GPU, providing $O(N \log N)$ execution time complexity, where $N$ is the number of degrees of freedom. We have confirmed this complexity by solving problems with up to 1 million degrees of freedom on a 448-core GPU. © 2012 Published by Elsevier Ltd.
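
The stages named in this abstract (leaf frontal matrices, merging at parent nodes, elimination of fully assembled degrees of freedom, root solve, backward substitution) can be illustrated on a toy problem. The following is a minimal sequential sketch in Python, assuming a one-dimensional chain of two-node elements with one unknown per node. It is not the paper's graph grammar formulation or its GPU implementation; the function name `solve_multifrontal` and the element data layout are hypothetical.

```python
def solve_multifrontal(elem_mats, elem_rhs):
    """Solve a 1D chain of 2x2 element systems by recursive front elimination.

    elem_mats[e] is the 2x2 matrix of element e (joining nodes e and e+1),
    elem_rhs[e] is its 2-entry load vector. Both layouts are assumptions.
    """
    n_el = len(elem_mats)
    x = [0.0] * (n_el + 1)

    def eliminate(lo, hi):
        # Returns the 2x2 front on nodes (lo, hi), its right-hand side,
        # and a callback performing backward substitution below this node.
        if hi - lo == 1:  # leaf: the element frontal matrix itself
            return [row[:] for row in elem_mats[lo]], elem_rhs[lo][:], lambda: None
        mid = (lo + hi) // 2
        a, b_l, back_l = eliminate(lo, mid)
        c, b_r, back_r = eliminate(mid, hi)
        # Merge the two child fronts into a 3x3 front ordered (lo, mid, hi).
        A = [[a[0][0], a[0][1], 0.0],
             [a[1][0], a[1][1] + c[0][0], c[0][1]],
             [0.0, c[1][0], c[1][1]]]
        b = [b_l[0], b_l[1] + b_r[0], b_r[1]]
        # Node `mid` is now fully assembled: eliminate it (Schur complement).
        piv = A[1][1]
        keep = (0, 2)
        S = [[A[i][j] - A[i][1] * A[1][j] / piv for j in keep] for i in keep]
        g = [b[i] - A[i][1] * b[1] / piv for i in keep]

        def back():
            # Recover x[mid] from the already known interface values.
            x[mid] = (b[1] - A[1][0] * x[lo] - A[1][2] * x[hi]) / piv
            back_l()
            back_r()

        return S, g, back

    A, b, back = eliminate(0, n_el)
    # Root problem: the remaining 2x2 system, solved by Cramer's rule.
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    x[0] = (b[0] * A[1][1] - A[0][1] * b[1]) / det
    x[n_el] = (A[0][0] * b[1] - A[1][0] * b[0]) / det
    back()
    return x
```

In this sketch the two recursive calls at each tree node are independent of each other, which is the kind of concurrency the abstract refers to; here they simply run one after the other.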

    Goal-oriented self-adaptive hp finite element simulation of 3D DC borehole resistivity simulations

    In this paper we present a goal-oriented self-adaptive hp Finite Element Method (hp-FEM) with shared data structures and a parallel multi-frontal direct solver. The algorithm automatically generates (without any user interaction) a sequence of meshes delivering exponential convergence of a prescribed quantity of interest with respect to the number of degrees of freedom. The sequence of meshes is generated from a given initial mesh by performing h (breaking elements into smaller elements), p (adjusting polynomial orders of approximation), or hp (both) refinements on the finite elements. The new parallel implementation utilizes a computational mesh shared between multiple processors. All computational algorithms, including automatic hp goal-oriented adaptivity and the solver, work fully in parallel. We describe the parallel self-adaptive hp-FEM algorithm with a shared computational domain, as well as its efficiency measurements. We apply the described methodology to the three-dimensional simulation of the borehole resistivity measurement of direct current through casing in the presence of invasion. © 2011 Published by Elsevier Ltd.

    An Agent-Oriented Hierarchic Strategy for Solving Inverse Problems

    The paper discusses a complex, agent-oriented hierarchic memetic strategy (HMS) dedicated to solving inverse parametric problems. The strategy goes beyond the idea of two-phase global optimization algorithms. The global search performed by a tree of dependent demes is dynamically alternated with local, steepest descent searches. The strategy offers exceptionally low computational costs, mainly because the accuracy of the direct solver (performed by the hp-adaptive finite element method) is dynamically adjusted for each inverse search step. The computational cost is further decreased by the strategy employed for solution inter-processing and fitness deterioration. The HMS efficiency is compared with the results of a standard evolutionary technique, as well as with the multi-start strategy, on benchmarks that exhibit the typical difficulties of inverse problems. Finally, an HMS application to a real-life engineering problem, the identification of oil deposits by inverting magnetotelluric measurements, is presented. The HMS applicability to the inversion of magnetotelluric data is also mathematically verified.

    Dynamic programming algorithm for generation of optimal elimination trees for multi-frontal direct solver over h-refined grids

    In this paper we present a dynamic programming algorithm for finding optimal elimination trees for computational grids refined towards point or edge singularities. The elimination tree is utilized to guide the multi-frontal direct solver algorithm. Thus, the criterion for the optimization of the elimination tree is the computational cost associated with the multi-frontal solver algorithm executed over such a tree. We illustrate the paper with several examples of optimal trees found for grids with point, isotropic edge, and anisotropic edge mixed with point singularities. We compare the execution time of the multi-frontal solver algorithm with that of the MUMPS solver using the METIS library, which implements the nested dissection algorithm. © The Authors. Published by Elsevier B.V.
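
The idea of choosing an elimination tree by dynamic programming can be shown on a much simpler setting than the h-refined grids treated in the paper. The sketch below, in Python, assumes a one-dimensional chain of elements in which node `i` carries `node_dofs[i]` unknowns, and assumes a rough cost model in which eliminating `m` unknowns from a dense front of size `n` costs `m * n**2`. Both the cost model and the function name `optimal_elimination_tree` are illustrative assumptions, not the paper's algorithm or cost function.

```python
from functools import lru_cache


def optimal_elimination_tree(node_dofs):
    """Return (cost, tree) of a cheapest binary elimination tree for a 1D chain.

    node_dofs[i] is the number of unknowns at node i; element e joins
    nodes e and e+1. Leaves of the returned tree are element indices and
    inner nodes are pairs (left_subtree, right_subtree).
    """
    n_el = len(node_dofs) - 1

    def merge_cost(lo, mid, hi):
        # Assumed cost model: eliminating m unknowns from a dense front of
        # size n costs about m * n^2 operations.
        m = node_dofs[mid]
        n = node_dofs[lo] + m + node_dofs[hi]
        return m * n * n

    @lru_cache(maxsize=None)
    def best(lo, hi):
        # Cheapest way to reduce elements lo..hi-1 to a front on (lo, hi).
        if hi - lo == 1:
            return 0, lo  # a single element needs no elimination
        candidates = []
        for mid in range(lo + 1, hi):  # node eliminated last in this range
            cost_l, tree_l = best(lo, mid)
            cost_r, tree_r = best(mid, hi)
            total = cost_l + cost_r + merge_cost(lo, mid, hi)
            candidates.append((total, (tree_l, tree_r)))
        return min(candidates, key=lambda c: c[0])

    return best(0, n_el)


if __name__ == "__main__":
    # One heavy node in the middle of a four-element chain.
    cost, tree = optimal_elimination_tree([1, 1, 5, 1, 1])
    print(cost, tree)
```

With the heavy node in the middle, the cheapest tree under this assumed cost model is an unbalanced one that merges the two elements sharing the heavy node first, so the optimal tree need not be the balanced one.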

    A hybrid method for inversion of 3D DC resistivity logging measurements

    This paper focuses on the application of the hp hierarchic genetic strategy (hp-HGS) to the solution of a challenging problem, the inversion of 3D direct current (DC) resistivity logging measurements. The problem under consideration has been formulated as a global optimization problem, for which the objective function (the misfit between computed and reference data) exhibits multiple minima. In this paper, we consider an extension of the hp-HGS strategy: we couple the hp-HGS algorithm with a gradient-based optimization method for local search. Forward simulations are performed with a self-adaptive hp finite element method, hp-FEM. The computational cost of misfit evaluation by hp-FEM depends strongly on the assumed accuracy. This accuracy is adapted to the tree of populations generated by the hp-HGS algorithm, which makes the global phase significantly cheaper. Moreover, the tree structure of demes, as well as the branch reduction and conditional sprouting mechanisms, reduces the number of expensive local searches to as few as the number of minima to be recognized. The common (direct and inverse) accuracy control, crucial for the hp-HGS efficiency, has been motivated by precise mathematical considerations. Numerical results demonstrate the suitability of the proposed method for the inversion of 3D DC resistivity logging measurements.

    Parallel refined Isogeometric Analysis in 3D

    We study three-dimensional isogeometric analysis (IGA) and the solution of the resulting system of linear equations via a direct solver. IGA uses highly continuous $C^{p-1}$ basis functions, which provide multiple benefits in terms of stability and convergence properties. However, smooth basis functions significantly deteriorate the direct solver performance and its parallel scalability. As a partial remedy for this, the refined Isogeometric Analysis (rIGA) method improves the sequential execution of direct solvers. The refinement strategy enriches traditional highly continuous $C^{p-1}$ IGA spaces by introducing low-continuity $C^0$-hyperplanes along the boundaries of certain pre-defined macro-elements. In this work, we propose a solution strategy for rIGA for parallel distributed memory machines and compare the computational costs of solving rIGA vs IGA discretizations. We verify our estimates with parallel numerical experiments. Results show that the weak parallel scalability of the direct solver improves approximately by a factor of $p^2$ when considering rIGA discretizations rather than highly continuous IGA spaces.