    Adaptive BDDC in Three Dimensions

    The adaptive BDDC method is extended to the selection of face constraints in three dimensions. A new implementation of the BDDC method is presented based on a global formulation without an explicit coarse problem, with massive parallelism provided by a multifrontal solver. Constraints are implemented by a projection, and sparsity of the projected operator is preserved by a generalized change of variables. The effectiveness of the method is illustrated on several engineering problems. Comment: 28 pages, 9 figures, 9 tables
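
    As a rough illustration of enforcing a constraint by projection, the sketch below removes from a vector its component along a single constraint vector, so that the result satisfies the constraint exactly. The function name `project_out`, the averaging constraint, and the sample values are assumptions made for this example. The abstract does not specify the projection used in the paper, and the generalized change of variables that preserves sparsity is not shown.

```python
def project_out(u, c):
    """Return u minus its component along c, so that the result satisfies c . u = 0.

    This is the orthogonal projection onto the null space of one constraint
    (an assumption for illustration, not the paper's operator).
    """
    cu = sum(ci * ui for ci, ui in zip(c, u))
    cc = sum(ci * ci for ci in c)
    if cc == 0.0:
        raise ValueError("constraint vector must be nonzero")
    return [ui - ci * cu / cc for ci, ui in zip(c, u)]


# Hypothetical face with four degrees of freedom and an averaging constraint.
face_average = [0.25, 0.25, 0.25, 0.25]
jump = [1.0, 2.0, 3.0, 6.0]

projected = project_out(jump, face_average)
residual = sum(ci * ui for ci, ui in zip(face_average, projected))
```

    After the projection, the face average of the vector vanishes, which is the kind of condition a face constraint imposes on the jump across a face.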

    Parallel adaptive FETI-DP using lightweight asynchronous dynamic load balancing

    A parallel FETI-DP domain decomposition method using an adaptive coarse space is presented. The implementation builds on a recently introduced adaptive FETI-DP approach for elliptic problems in three dimensions and uses small, local eigenvalue problems for faces and, additionally, for a small number of edges. The condition number of the preconditioned operator then satisfies a bound which is independent of coefficient heterogeneities in the problem. The computational cost of the local eigenvalue problems is not negligible, and they can introduce a significant load imbalance. As a remedy, certain eigenvalue problems are discarded by a theory-guided heuristic strategy based on the diagonal entries of the stiffness matrices. Additionally, a lightweight pairwise dynamic load balancing strategy is implemented for the eigenvalue problems. The load balancing is supervised by an orchestrating rank using asynchronous point-to-point communication. The resulting method shows good weak and strong scalability up to thousands of cores, while fast convergence is obtained even for heterogeneous problems.
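
    The planning step of a pairwise load balancing scheme might look like the sketch below: an orchestrator sorts the ranks by their number of pending eigenvalue problems, pairs the most loaded rank with the least loaded one, and moves half of the difference. This is a serial sketch under stated assumptions, not the strategy of the paper. The function names, the equal-cost assumption, and the sample loads are hypothetical, and the asynchronous point-to-point communication is not modeled.

```python
def plan_pairwise_transfers(loads):
    """Pair the most loaded rank with the least loaded one and move half the gap.

    loads maps rank -> number of pending eigenvalue problems (assumed equal cost).
    Returns a list of (sender, receiver, count) tuples.
    """
    order = sorted(loads, key=lambda rank: (loads[rank], rank))
    transfers = []
    lo, hi = 0, len(order) - 1
    while lo < hi:
        receiver, sender = order[lo], order[hi]
        count = (loads[sender] - loads[receiver]) // 2
        if count > 0:
            transfers.append((sender, receiver, count))
        lo += 1
        hi -= 1
    return transfers


def apply_transfers(loads, transfers):
    """Return the loads after all planned transfers have been carried out."""
    balanced = dict(loads)
    for sender, receiver, count in transfers:
        balanced[sender] -= count
        balanced[receiver] += count
    return balanced


# Hypothetical number of eigenvalue problems per rank.
loads = {0: 12, 1: 2, 2: 7, 3: 3}
transfers = plan_pairwise_transfers(loads)
balanced = apply_transfers(loads, transfers)
```

    Each rank takes part in at most one exchange per round, which keeps the communication pattern pairwise.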

    Balancing domain decomposition by constraints and perturbation

    In this paper, we formulate and analyze a perturbed formulation of the balancing domain decomposition by constraints (BDDC) method. We prove that the perturbed BDDC has the same polylogarithmic bound for the condition number as the standard formulation. Two types of properly scaled zero-order perturbations are considered: one uses a mass matrix, and the other uses a Robin-type boundary condition, i.e., a mass matrix on the interface. With perturbation, the well-posedness of the local Neumann problems and the global coarse problem is automatically guaranteed, so coarse degrees of freedom need to be chosen only to ensure convergence, not well-posedness. This allows a much simpler implementation, as no complicated corner selection algorithm is needed. Minimal coarse spaces using only face or edge constraints can also be considered. They are very useful in extreme-scale calculations, where the coarse problem is usually the bottleneck that can jeopardize scalability. The perturbation also adds extra robustness, as the perturbed formulation works even when the constraints fail to eliminate a small number of subdomain rigid body modes from the standard BDDC space. This is extremely important when solving problems on unstructured meshes partitioned by automatic graph partitioners, since arbitrary disconnected subdomains are possible. Numerical results are provided to support the theoretical findings.
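
    The effect of a zero-order perturbation on a local Neumann problem can be seen in a one-dimensional model. The stiffness matrix of a pure Neumann problem has the constants in its null space and is only positive semidefinite, whereas adding a small multiple of a mass matrix makes it positive definite. The sketch below is a minimal illustration under stated assumptions, not the formulation analyzed in the paper: the 1D mesh, the lumped mass matrix, the value of `eps`, and all function names are chosen for this example.

```python
import math


def neumann_stiffness(n):
    """Stiffness matrix of -u'' on n equal elements of (0, 1), no Dirichlet condition."""
    h = 1.0 / n
    k = [[0.0] * (n + 1) for _ in range(n + 1)]
    for e in range(n):
        k[e][e] += 1.0 / h
        k[e + 1][e + 1] += 1.0 / h
        k[e][e + 1] -= 1.0 / h
        k[e + 1][e] -= 1.0 / h
    return k


def lumped_mass(n):
    """Diagonal of the lumped mass matrix on the same mesh."""
    h = 1.0 / n
    m = [h] * (n + 1)
    m[0] = m[n] = h / 2.0
    return m


def is_positive_definite(a, tol=1e-12):
    """Attempt a Cholesky factorization; fail as soon as a pivot is not above tol."""
    n = len(a)
    l = [[0.0] * n for _ in range(n)]
    for j in range(n):
        d = a[j][j] - sum(l[j][k] ** 2 for k in range(j))
        if d <= tol:
            return False
        l[j][j] = math.sqrt(d)
        for i in range(j + 1, n):
            s = sum(l[i][k] * l[j][k] for k in range(j))
            l[i][j] = (a[i][j] - s) / l[j][j]
    return True


n = 4
eps = 1e-2  # hypothetical perturbation size; the paper's scaling is not reproduced here
k = neumann_stiffness(n)
m = lumped_mass(n)

# Perturbed operator K + eps * M.
k_pert = [row[:] for row in k]
for i in range(n + 1):
    k_pert[i][i] += eps * m[i]
```

    In this model, `is_positive_definite(k)` fails because the last pivot vanishes, while `is_positive_definite(k_pert)` succeeds, so the perturbed local problem can be factorized without any constraint removing the constant mode.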

    Combining Machine Learning and Adaptive Coarse Spaces - A Hybrid Approach for Robust FETI-DP Methods in Three Dimensions

    The hybrid ML-FETI-DP algorithm combines the advantages of adaptive coarse spaces in domain decomposition methods and certain supervised machine learning techniques. Adaptive coarse spaces ensure robustness of highly scalable domain decomposition solvers, even for highly heterogeneous coefficient distributions with arbitrary coefficient jumps. However, their construction requires the setup and solution of local generalized eigenvalue problems, which is typically computationally expensive. The idea of ML-FETI-DP is to interpret the coefficient distribution as image data and predict whether an eigenvalue problem has to be solved or can be neglected while still maintaining robustness of the adaptive FETI-DP method. For this purpose, neural networks are used as image classifiers. In the present work, the ML-FETI-DP algorithm is extended to three dimensions, which requires both a complex data preprocessing procedure to construct consistent input data for the neural network and a representative training and validation data set to ensure generalization properties of the machine learning model. Numerical experiments for stationary diffusion and linear elasticity problems with realistic coefficient distributions show that a large number of eigenvalue problems can be saved; in the best case of the numerical results presented here, the setup and solution of 97% of the eigenvalue problems can be avoided.
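
    Once a classifier has produced, for each face or edge, a score for "an eigenvalue problem is needed", the decision step reduces to a threshold test. The sketch below shows only this last step, under assumptions that are not taken from the abstract: the scores are hypothetical, the function names are made up, and the use of a tunable threshold is an illustrative choice. The neural network and the data preprocessing are not shown.

```python
def select_eigenproblems(probabilities, threshold=0.5):
    """Return the indices of faces/edges whose eigenvalue problem is still solved."""
    return [i for i, p in enumerate(probabilities) if p >= threshold]


def saved_fraction(probabilities, threshold=0.5):
    """Fraction of eigenvalue problems that are skipped for the given threshold."""
    solved = len(select_eigenproblems(probabilities, threshold))
    return 1.0 - solved / len(probabilities)


# Hypothetical classifier outputs for eight faces/edges.
scores = [0.02, 0.10, 0.45, 0.55, 0.97, 0.01, 0.30, 0.05]

default = select_eigenproblems(scores, 0.5)
cautious = select_eigenproblems(scores, 0.4)
```

    Lowering the threshold solves more eigenvalue problems and so reduces the risk of skipping one that is needed for robustness, at the price of a smaller saving.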

    Software concepts and algorithms for an efficient and scalable parallel finite element method

    Software packages for the numerical solution of partial differential equations (PDEs) using the finite element method are important in different fields of research. The basic data structures and algorithms change over time, as the users' requirements grow and the software must make efficient use of the newest highly parallel computing systems. This is the central point of this work. To make efficient use of parallel computing systems with a growing number of independent basic computing units, i.e., CPUs, we have to combine data structures and algorithms from different areas of mathematics and computer science. Two crucial parts are a distributed mesh and a parallel solver for linear systems of equations. For both, there exist multiple independent approaches. In this work, we argue that it is necessary to combine both of them, and to link their underlying data structures, to allow for an efficient and scalable implementation of the finite element method. First, we present concepts, data structures and algorithms for distributed meshes which allow for local refinement. The central point of our presentation is to provide arbitrary geometrical information about the mesh and its distribution to the linear solver. A large part of the overall computing time of the finite element method is spent in the linear solver. Thus, its parallelization is of major importance. Based on the presented concept for distributed meshes, we present several different linear solver methods. Here we concentrate on general-purpose linear solvers, which make few assumptions about the systems to be solved. For this, a new FETI-DP (Finite Element Tearing and Interconnect - Dual Primal) method is proposed. Though the standard FETI-DP method is quasi-optimal from a mathematical point of view, it is not possible to implement it efficiently for a large number of processors (> 10,000). The main reason is a relatively small but globally distributed coarse mesh problem. To circumvent this problem, we propose a new multilevel FETI-DP method which hierarchically decomposes the coarse grid problem. This leads to a more local communication pattern for solving the coarse grid problem and makes it possible to scale to a large number of processors. Besides the parallelization of the finite element method, we discuss an approach to speed up serial computations of existing finite element packages. In many computations, the PDE to be solved involves more than one variable. This is especially the case in multi-physics modeling. Observations show that in many of these computations the solution structures of the different variables are only weakly coupled or completely decoupled from each other. But in the standard finite element method, only one mesh is used for the discretization of all variables. We present a multi-mesh finite element method which allows a system of PDEs to be discretized with two independently refined meshes.

    Robust exact and inexact FETI-DP methods with applications to elasticity

    Domain decomposition methods are fast, parallel, iterative solvers for large systems of equations arising from the discretization of partial differential equations, e.g., from structural mechanics. In this work, dual iterative substructuring methods of the FETI-DP (Finite Element Tearing and Interconnecting Dual-Primal) type are developed and applied to second-order elliptic partial differential equations, with emphasis on elasticity. In particular, an attempt is made to develop robust methods for homogeneous and heterogeneous elasticity problems. New inexact FETI-DP methods are also introduced that allow for inexact coarse problem solvers and/or inexact subdomain solvers. It is shown that, under certain conditions, the new algorithms fulfill the same asymptotic condition number estimate as the traditional, exact FETI-DP methods. Parallel results using algebraic multigrid for the FETI-DP coarse problem show the improved scalability of the new algorithms.

    Adaptive FETI-DP and BDDC methods for highly heterogeneous elliptic finite element problems in three dimensions

    Numerical methods are often well-suited for the solution of (elliptic) partial differential equations (PDEs) modeling naturally occurring processes. Many different solvers can be applied to systems which are obtained after discretization by the finite element method. Parallel architectures in modern computers facilitate the efficient use of diverse divide-and-conquer strategies. The intuitive approach of dividing a large (global) problem into subproblems, which are then solved in parallel, can significantly reduce the solution time. The solvers on the local subproblems should then deliver the contributions of the global solution restricted to the subdomains of the computational region. The class of domain decomposition methods provides widely used iterative algorithms for the parallel solution of implicit finite element problems. Often, an additional coarse space, which introduces a coupling between the subdomains, is used to ensure a global transport of information across the entire domain. The FETI-DP and BDDC domain decomposition methods are highly scalable parallel algorithms. However, when the parameter or coefficient distribution in the underlying partial differential equation becomes highly heterogeneous, classical methods with a priori chosen coarse spaces might not converge in a limited number of iterations. A remedy is offered by problem-dependent coarse spaces. These coarse spaces can be provided by adaptive methods, which can then improve the convergence at the cost of additional constraints. In this thesis, we introduce robust FETI-DP and BDDC methods for three-dimensional problems. These methods incorporate constraints, which are computed from local eigenvalue problems on faces and edges between subdomains, into the coarse space. The implementation of the constraints is performed by a deflation or balancing approach or by partial finite element assembly after a transformation of basis. For the latter, we introduce the generalized transformation-of-basis approach and show its correspondence to a deflation or balancing approach. An efficient parallel implementation of adaptive FETI-DP is discussed in the last part of this thesis. We provide weak and strong parallel scalability results for our adaptive algorithm executed on the supercomputer magnitUDE of the University of Duisburg-Essen. For weak scaling, we can show very good results up to 4,096 cores. We can also present very good strong scaling results up to 864 cores.

    A Frugal FETI-DP and BDDC Coarse Space for Heterogeneous Problems

    The convergence rate of domain decomposition methods is generally determined by the eigenvalues of the preconditioned system. For second-order elliptic partial differential equations, coefficient discontinuities with a large contrast can lead to a deterioration of the convergence rate. A robust domain decomposition method can be obtained only by implementing an appropriate coarse space, or second level. In this article, a new frugal coarse space for FETI-DP (Finite Element Tearing and Interconnecting - Dual Primal) and BDDC (Balancing Domain Decomposition by Constraints) methods is presented, which has a lower set-up cost than competing adaptive coarse spaces. In particular, in contrast to adaptive coarse spaces, it does not require the solution of any local generalized eigenvalue problems. The approach considered here aims at a low-dimensional approximation of the adaptive coarse space by using appropriate weighted averages and is robust for a broad range of coefficient distributions for diffusion and elasticity problems. In this article, the robustness is heuristically justified as well as numerically shown for several coefficient distributions. The new coarse space is compared to adaptive coarse spaces, and parallel scalability up to 262,144 parallel cores for a parallel BDDC implementation with the new coarse space is shown. The superiority of the new coarse space over classic coarse spaces with respect to parallel weak scalability and time to solution is confirmed by numerical experiments.
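
    To illustrate how a weighted average differs from a plain average when the coefficient jumps, the sketch below weights nodal values on a hypothetical edge by a coefficient attached to each node. The weights, the sample values, and the function name are assumptions for this example. The abstract does not state the exact weights used in the frugal coarse space, so this shows only the general idea of a coefficient-weighted average.

```python
def weighted_average(values, coefficients):
    """Coefficient-weighted average of nodal values (weights must have a positive sum)."""
    total = sum(coefficients)
    if total <= 0.0:
        raise ValueError("coefficients must have a positive sum")
    return sum(c * v for c, v in zip(coefficients, values)) / total


# Hypothetical edge with four nodes; the last two lie in a stiff inclusion.
values = [1.0, 1.0, 5.0, 5.0]
rho = [1.0, 1.0, 1.0e6, 1.0e6]

plain = sum(values) / len(values)
weighted = weighted_average(values, rho)
```

    With a contrast of 1e6, the weighted average is dominated by the values in the high-coefficient region, whereas the plain average treats all nodes alike.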