37 research outputs found

    Robust exact and inexact FETI-DP methods with applications to elasticity

    Domain decomposition methods are fast parallel solvers for large systems of equations arising from the discretisation of partial differential equations, e.g., from structural mechanics. In this work, dual iterative substructuring methods of the FETI-DP (Finite Element Tearing and Interconnecting Dual-Primal) type are developed and applied to second-order elliptic problems, with emphasis on elasticity. An attempt is made to develop robust methods for homogeneous and heterogeneous elasticity problems. New inexact FETI-DP methods are also introduced that allow for inexact coarse problem solvers and/or inexact subdomain solvers. It is shown that, under certain conditions, the new algorithms fulfill the same asymptotic condition number estimate as the traditional, exact FETI-DP methods. Parallel results using algebraic multigrid for the FETI-DP coarse problem show the improved scalability of the new algorithms.
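
    For orientation, the "asymptotic condition number estimate" of exact FETI-DP mentioned in the abstract above usually refers to the classical polylogarithmic bound for the preconditioned operator. The notation here is an assumption and is not taken from the abstract: F denotes the FETI-DP operator, M^{-1} the Dirichlet preconditioner, H the subdomain diameter, h the mesh size, and C a constant independent of H and h.

```latex
\[
  \kappa\!\left(M^{-1}F\right) \;\le\; C\left(1 + \log\frac{H}{h}\right)^{2}
\]
```

    Read this way, the claim is that the inexact variants keep the same dependence on H/h, presumably with a different constant; the precise conditions are stated in the thesis itself.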

    Nonlinear FETI-DP and BDDC Methods

    In the simulation of deformation processes in materials science, it is often necessary to consider the microscopic material structure, as in the simulation of modern high-strength steels. A straightforward finite element discretization of the complete deformed body that resolves the microscopic structure leads to very large nonlinear problems, and a solution is out of reach, even on modern supercomputers. In homogenization approaches, such as the computational scale-bridging approach FE2, the macroscopic scale of the deformed object is decoupled from the microscopic scale of the material structure. These approaches only consider the microstructure in a localized fashion on independent and parallel representative volume elements (RVEs). This introduces massive parallelism on the macroscopic level and is thus ideal for modern computer architectures with large numbers of parallel computational cores. Nevertheless, the discretization of an RVE can still result in large nonlinear problems, and thus highly scalable parallel solvers are necessary. In this context, this thesis discusses nonlinear FETI-DP (Finite Element Tearing and Interconnecting - Dual-Primal) and BDDC (Balancing Domain Decomposition by Constraints) domain decomposition methods, which are parallel solution methods for nonlinear problems arising from a finite element discretization. These approaches can be viewed as strategies to further localize the computational work and to extend the parallel scalability of classical FETI-DP and BDDC methods towards extreme-scale supercomputers. Variants providing an inexact solution of the FETI-DP coarse problem are also considered, combining two successful paradigms, i.e., nonlinear domain decomposition and AMG (Algebraic Multigrid). An efficient implementation of the resulting inexact reduced Nonlinear-FETI-DP-1 method is presented, and scalability beyond 200,000 computational cores is shown.
Finally, a highly scalable FE2 implementation using recent inexact reduced FETI-DP methods to solve the RVE problems on the microscopic level is presented, and scalability on all 458,752 cores of the JUQUEEN BlueGene/Q system at Forschungszentrum Jülich is demonstrated.

    Software concepts and algorithms for an efficient and scalable parallel finite element method

    Software packages for the numerical solution of partial differential equations (PDEs) using the finite element method are important in many fields of research. The basic data structures and algorithms change over time, as users' requirements grow and the software must make efficient use of the newest highly parallel computing systems. This is the central point of this work. To make efficient use of parallel computing systems with a growing number of independent basic computing units, i.e., CPUs, we have to combine data structures and algorithms from different areas of mathematics and computer science. Two crucial parts are a distributed mesh and a parallel solver for linear systems of equations. For both, there exist multiple independent approaches. In this work we argue that it is necessary to combine the two to allow for an efficient and scalable implementation of the finite element method. First, we present concepts, data structures and algorithms for distributed meshes that allow for local refinement. The central point of our presentation is to provide arbitrary geometrical information about the mesh and its distribution to the linear solver. A large part of the overall computing time of the finite element method is spent in the linear solver. Thus, its parallelization is of major importance. Based on the presented concept for distributed meshes, we present several different linear solver methods. Here we concentrate on general-purpose linear solvers, which make few assumptions about the systems to be solved. For this, a new FETI-DP (Finite Element Tearing and Interconnect - Dual Primal) method is proposed. Though the standard FETI-DP method is quasi-optimal from a mathematical point of view, it cannot be implemented efficiently for a large number of processors (> 10,000). The main reason is a relatively small but globally distributed coarse mesh problem.
To circumvent this problem, we propose a new multilevel FETI-DP method which hierarchically decomposes the coarse grid problem. This leads to a more local communication pattern for solving the coarse grid problem and makes it possible to scale to a large number of processors. Besides the parallelization of the finite element method, we discuss an approach to speed up serial computations of existing finite element packages. In many computations, the PDE to be solved involves more than one variable. This is especially the case in multi-physics modeling. Observations show that in many of these computations the solution structure of the variables differs. In the standard finite element method, however, only one mesh is used for the discretization of all variables. We present a multi-mesh finite element method, which allows a system of PDEs to be discretized with two independently refined meshes.

    Balancing domain decomposition by constraints algorithms for incompressible Stokes equations with nonconforming finite element discretizations

    Hybridizable Discontinuous Galerkin (HDG) is an important family of methods that combines the advantages of Discontinuous Galerkin methods in terms of flexibility with those of standard finite elements in terms of accuracy and efficiency. The impact of this method is partly evidenced by the volume of research work in this area. Weak Galerkin (WG) is a more recently proposed method that introduces weak functions and generalizes the differential operators for them. This method has also drawn considerable interest from both numerical practitioners and analysts recently. HDG and WG are different but closely related. BDDC algorithms are developed for the numerical solution of elliptic problems with both methods. We prove that the optimal condition number estimate for BDDC operators with standard finite element methods can be extended to the counterparts arising from the HDG and WG methods, which are nonconforming finite element methods. Numerical experiments are conducted to verify the theoretical analysis. Further, we propose BDDC algorithms for the saddle point system arising from the Stokes equations using both HDG and WG methods. By design of the preconditioner, the iterations are restricted to a benign subspace, which makes the BDDC operator effectively positive definite and thus solvable by the conjugate gradient method. We prove that the algorithm is scalable in the number of subdomains, with a convergence rate that depends only on the subdomain problem size. The condition number bound for the BDDC preconditioned Stokes system is the same as the optimal bound for the elliptic case. Numerical results confirm the theoretical analysis.

    Parallel Algorithms for the Solution of Large-Scale Fluid-Structure Interaction Problems in Hemodynamics

    This thesis addresses the development and implementation of efficient and parallel algorithms for the numerical simulation of Fluid-Structure Interaction (FSI) problems in hemodynamics. Hemodynamic conditions in large arteries are significantly affected by the interaction of the pulsatile blood flow with the arterial wall. The simulation of fluid-structure interaction problems requires the approximation of a coupled system of Partial Differential Equations (PDEs) and the setup of efficient numerical solution strategies. Blood is modeled as an incompressible Newtonian fluid whose dynamics is governed by the Navier-Stokes equations. Different constitutive models are used to describe the mechanical response of the arterial wall; specifically, we rely on hyperelastic isotropic and anisotropic material laws. The finite element method is used for the space discretization of both the fluid and structure problems. In particular, for the Navier-Stokes equations we consider a semi-discrete formulation based on the Variational Multiscale (VMS) method. Among a wide range of possible solution strategies for the FSI problem, here we focus on strongly coupled monolithic approaches wherein the nonlinearities are treated in a fully implicit mode. To cope with the high computational complexity of the three-dimensional FSI problem, a parallel solution framework is often mandatory. To this end, we develop a new block parallel preconditioner for the coupled linearized FSI system obtained after space and time discretization. The proposed preconditioner, named FaCSI, exploits the factorized form of the FSI Jacobian matrix, the use of static condensation to formally eliminate the interface degrees of freedom of the fluid equations, and the use of a SIMPLE preconditioner for the unsteady Navier-Stokes equations.
In FSI problems, the different resolution requirements in the fluid and structure physical domains, as well as the presence of complex interface geometries, make the use of matching fluid and structure meshes problematic. In such situations, it is much simpler to deal with discretizations that are nonconforming at the interface, provided that the matching conditions at the interface are properly fulfilled. In this thesis we develop a novel interpolation-based method, named INTERNODES, for numerically solving partial differential equations by Galerkin methods on computational domains that are split into two (or several) subdomains featuring nonconforming interfaces. By this we mean that a priori independent grids and/or local polynomial degrees are used to discretize each subdomain. INTERNODES can be regarded as an alternative to the mortar element method: it combines the accuracy of the latter with ease of implementation in a numerical code. These techniques have been applied to the numerical simulation of large-scale fluid-structure interaction problems in the context of biomechanics. The parallel algorithms developed showed scalability up to thousands of cores on high performance computing machines.

    Domain Decomposition Methods for Elastic Materials with Compressible and Almost Incompressible Components

    Domain decomposition methods are iterative methods for solving large systems of equations obtained, e.g., from finite element discretization. Here, the domain is decomposed into smaller subproblems, which can be solved in parallel. In the first part of this work, new condition number bounds are proven for a FETI-DP type (Finite Element Tearing and Interconnecting Dual-Primal) domain decomposition method for compressible linear elasticity in 3D. Each subdomain may contain an inclusion having different material properties. The condition number bound depends only on the subdomain diameter, the finite element diameter, and the thickness of the compressible hull. It is independent of the material parameters in the inclusions; thus, almost incompressible inclusions are also possible. In the second part of this thesis, a new coarse space for FETI-DP methods for almost incompressible linear elasticity on the whole domain is presented. This coarse space is much smaller than the standard coarse space for FETI-DP or BDDC methods for almost incompressible linear elasticity.

    Hybride 3D Simulationsmethoden zur Abbildung der Schädigungsvorgänge in Mehrphasen-Verbundwerkstoffen [Hybrid 3D simulation methods for modeling damage processes in multiphase composites]

    Modern digital material approaches for the visualization and simulation of heterogeneous materials make it possible to investigate the behavior of complex multiphase materials with their physically nonlinear material response at various scales. However, these computational techniques require extensive hardware resources, in terms of computing power and main memory, to solve large-scale discretized models numerically in 3D. Because the number of degrees of freedom is very high and can rapidly grow into the tens of millions, the limited hardware resources must be used as efficiently as possible so that the numerical algorithms run in minimal computation time. In the field of computational mechanics, various methods and algorithms can lead to an optimized runtime behavior of nonlinear simulation models, and several such approaches are proposed and investigated in this thesis. Today, the numerical simulation of damage effects in heterogeneous materials is performed by adapting multiscale methods. A consistent modeling in three-dimensional space with an appropriate discretization resolution on each scale (based on a hierarchical or concurrent multiscale model), however, still poses computational challenges with respect to the convergence behavior, the scale transition, and the solver performance of the weakly coupled problems. The computational efficiency and the distribution among available hardware resources (often based on a parallel hardware architecture) can be improved significantly. In recent years, high-performance computing (HPC) and graphics processing unit (GPU) based computation techniques have become established for the investigation of scientific objectives. Their application leads to the modification of existing computational methods and the development of new ones for the numerical implementation, which makes it possible to take advantage of massively clustered computer hardware resources.
In the field of numerical simulation in materials science, e.g., in the investigation of damage effects in multiphase composites, the suitability of such models is often restricted by the number of degrees of freedom (d.o.f.s) in the three-dimensional spatial discretization. This poses difficulties for the implementation method used in the nonlinear simulation procedure and, at the same time, strongly influences memory demand and computational time. In this thesis, a hybrid discretization technique has been developed for the three-dimensional discretization of a three-phase material, which takes into account the numerical efficiency of nonlinear (damage) simulations of these materials. The increase in computational efficiency is enabled by the improved scalability of the numerical algorithms. Consequently, substructuring methods for partitioning the hybrid mesh were implemented, tested, and adapted to the HPC computing framework, using several hundred CPU (central processing unit) nodes for building the finite element assembly. A memory-efficient, iterative, and parallelized equation solver, combined with a special preconditioning technique for solving the underlying equation system, was modified and adapted to enable combined CPU and GPU based computations. The author therefore recommends applying the substructuring method to hybrid meshes, since it respects the different material phases and their mechanical behavior and makes it possible to split the structure into elastic and inelastic parts. The nonlinear material behavior specified for the corresponding phase then needs to be considered only in the inelastic domains, which reduces the computing time of the nonlinear procedure. Due to the high numerical effort of such simulations, an alternative approach to nonlinear finite element analysis, based on sequentially linear analysis, was implemented with respect to scalable HPC.
The incremental-iterative procedure in finite element analysis (FEA) during the nonlinear step was then replaced by a sequence of linear FE analyses whenever damage occurred in critical regions, known in the literature as the saw-tooth approach. As a result, qualitative (smeared) crack initiation in 3D multiphase specimens has been simulated efficiently.
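
    The saw-tooth idea described in the abstract above can be illustrated with a toy model. The sketch below is not the thesis implementation: it assumes a set of parallel bars sharing one displacement, and the function name `sequentially_linear_analysis` as well as the reduction factors `k_factor` and `f_factor` are hypothetical choices made for illustration. Each step runs one linear analysis, scales the load until the most critical bar reaches its strength, and then reduces that bar's stiffness and strength.

```python
"""Sequentially linear (saw-tooth) analysis of parallel brittle bars: a toy sketch."""


def sequentially_linear_analysis(stiffness, strength, n_steps,
                                 k_factor=0.5, f_factor=0.8):
    """Return a list of (displacement, load, damaged_bar) tuples.

    All bars share one displacement and jointly carry the applied load.
    k_factor and f_factor are illustrative saw-tooth reduction factors
    (assumptions, not values from any particular reference).
    """
    k = list(stiffness)
    f = list(strength)
    history = []
    for _ in range(n_steps):
        total_k = sum(k)
        if total_k <= 0.0:
            break  # nothing left to carry load
        # Linear analysis under a unit reference load.
        u_ref = 1.0 / total_k
        forces = [ki * u_ref for ki in k]
        # Load multiplier at which each bar reaches its current strength.
        multipliers = [fi / force if force > 0.0 else float("inf")
                       for fi, force in zip(f, forces)]
        # The critical bar is the one that reaches its strength first.
        critical = min(range(len(k)), key=lambda i: multipliers[i])
        lam = multipliers[critical]
        history.append((lam * u_ref, lam, critical))
        # Saw-tooth update: reduce stiffness and strength of the critical bar.
        k[critical] *= k_factor
        f[critical] *= f_factor
    return history


if __name__ == "__main__":
    for u, load, bar in sequentially_linear_analysis([1.0, 1.0], [1.0, 2.0], 3):
        print(f"u = {u:.3f}, load = {load:.3f}, damaged bar = {bar}")
```

    Because every step is a purely linear solve, no incremental-iterative procedure is needed; the load-displacement history is assembled from the sequence of critical events.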