86 research outputs found
Classical and all-floating FETI methods for the simulation of arterial tissues
High-resolution and anatomically realistic computer models of biological soft
tissues play a significant role in the understanding of the function of
cardiovascular components in health and disease. However, the computational
effort to handle fine grids to resolve the geometries as well as sophisticated
tissue models is very challenging. One possibility to derive a strongly
scalable parallel solution algorithm is to consider finite element tearing and
interconnecting (FETI) methods. In this study we propose and investigate the
application of FETI methods to simulate the elastic behavior of biological soft
tissues. As one particular example we choose the artery which is - as most
other biological tissues - characterized by anisotropic and nonlinear material
properties. We compare two specific approaches of FETI methods, classical and
all-floating, and investigate the numerical behavior of different
preconditioning techniques. In comparison to classical FETI, the all-floating
approach has not only advantages concerning the implementation but in many
cases also concerning the convergence of the global iterative solution method.
This behavior is illustrated with numerical examples. We present results of
linear elastic simulations to show convergence rates, as expected from the
theory, and results from the more sophisticated nonlinear case where we apply a
well-known anisotropic model to the realistic geometry of an artery. Although
FETI methods are widely applicable to artery simulations, we also discuss
some limitations concerning the dependence on material parameters. Comment: 29 page
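The tearing-and-interconnecting idea named in the abstract above can be sketched on a toy problem. This is a minimal illustration, not the authors' method: it assumes a 1D bar with unit load split into two subdomains that each keep one Dirichlet boundary, so neither subdomain is floating and no pseudo-inverse or rigid-body projection is needed. The names `bar_system` and `feti_two_subdomains` are hypothetical, and the small dual problem is solved directly where a real FETI solver would use preconditioned conjugate gradients.

```python
import numpy as np

def bar_system(n_el, h, fix_left, fix_right):
    """Stiffness matrix and load vector for -u'' = 1 on n_el linear elements."""
    n = n_el + 1
    K = np.zeros((n, n))
    f = np.zeros(n)
    for e in range(n_el):
        K[e:e + 2, e:e + 2] += np.array([[1.0, -1.0], [-1.0, 1.0]]) / h
        f[e:e + 2] += h / 2.0
    keep = list(range(n))
    if fix_left:
        keep = keep[1:]   # drop the fixed left node
    if fix_right:
        keep = keep[:-1]  # drop the fixed right node
    return K[np.ix_(keep, keep)], f[keep]

def feti_two_subdomains(n_left=2, n_right=4):
    """Tear a bar into two subdomains and glue them with one Lagrange multiplier."""
    h = 1.0 / (n_left + n_right)
    K1, f1 = bar_system(n_left, h, fix_left=True, fix_right=False)
    K2, f2 = bar_system(n_right, h, fix_left=False, fix_right=True)
    # Signed Boolean jump operators: B1 u1 + B2 u2 = 0 at the shared node.
    B1 = np.zeros((1, K1.shape[0]))
    B1[0, -1] = 1.0
    B2 = np.zeros((1, K2.shape[0]))
    B2[0, 0] = -1.0
    # Dual interface problem F lam = d, built from local solves only.
    F = B1 @ np.linalg.solve(K1, B1.T) + B2 @ np.linalg.solve(K2, B2.T)
    d = B1 @ np.linalg.solve(K1, f1) + B2 @ np.linalg.solve(K2, f2)
    lam = np.linalg.solve(F, d)
    # Recover the subdomain displacements from the multiplier.
    u1 = np.linalg.solve(K1, f1 - B1.T @ lam)
    u2 = np.linalg.solve(K2, f2 - B2.T @ lam)
    # Reference: the undecomposed problem solved in one piece.
    K, f = bar_system(n_left + n_right, h, True, True)
    u_ref = np.linalg.solve(K, f)
    return np.concatenate([u1, u2[1:]]), u_ref
```

Calling `feti_two_subdomains()` returns the glued subdomain solution and the monolithic reference, which should agree to rounding error. In the all-floating variant discussed in the abstract, the Dirichlet conditions would also be enforced through multipliers, which this sketch does not attempt.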
MERIC and RADAR generator: tools for energy evaluation and runtime tuning of HPC applications
This paper introduces two tools for manual energy evaluation and runtime tuning developed at IT4Innovations in the READEX project. The MERIC library can be used for manual instrumentation and analysis of any application from the energy and time consumption point of view. Besides tracing, MERIC can also change environment and hardware parameters during the application runtime, which leads to energy savings.
MERIC stores large amounts of data, which are difficult for a human to read. The RADAR generator analyses the MERIC output files to find the best settings of the evaluated parameters for each instrumented region. It generates a report and a MERIC configuration file for application production runs.
Recommended from our members
A Parallel Direct Method for Finite Element Electromagnetic Computations Based on Domain Decomposition
High-performance parallel computing and direct (factorization-based) solution methods have been the two main trends in electromagnetic computations in recent years. When the time-harmonic (frequency-domain) Maxwell's equations are directly discretized with the Finite Element Method (FEM) or other Partial Differential Equation (PDE) methods, the resulting linear system of equations is sparse and indefinite, and thus harder to factorize efficiently, serially or in parallel, than the dense linear systems produced by alternative methods such as integral equation solutions. State-of-the-art sparse direct solvers such as MUMPS and PARDISO do not scale favorably and have low parallel efficiency and a high memory footprint. This work introduces a new class of sparse direct solvers based on the domain decomposition method, termed the Direct Domain Decomposition Method (D3M), which is reliable, memory efficient, and offers very good parallel scalability for arbitrary 3D FEM problems.
Unlike recent trends in approximate/low-rank solvers, this method focuses on "numerically exact" solution methods, as they are more reliable for complex "real-life" models. The proposed method leverages physical insights at every stage of the development through a new symmetric domain decomposition method (DDM) with one set of Lagrange multipliers. A special regularization scheme applied at the interfaces introduces either artificial loss or gain into each domain to eliminate non-physical internal resonances. A block-wise recursive algorithm based on the Takahashi relationship is proposed for the efficient computation of the discrete Dirichlet-to-Neumann (DtN) map, which reduces the volumetric problem from all domains to an auxiliary surface problem defined on the domain interfaces only. Numerical results show up to 50% run-time savings in the DtN map computation using the proposed block-wise recursive algorithm compared to alternative approaches. The auxiliary unknowns on the domain interfaces form a considerably (approximately an order of magnitude) smaller block-wise sparse matrix, which is efficiently factorized using a customized block LDL factorization with restricted pivoting to ensure stability.
The parallelization of the proposed D3M is based on a Directed Acyclic Graph (DAG). Recent advances in parallel dense direct solvers have shifted toward parallel implementations that rely on DAG scheduling to achieve highly efficient asynchronous parallel execution. However, adapting such schemes to sparse matrices is harder and often impractical. In D3M, the computation of each domain's discrete DtN map is "embarrassingly parallel", whereas the customized block LDLT is suitable for block directed acyclic graph (B-DAG) task scheduling, similar to that used in dense matrix parallel direct solvers. In this approach, computations are represented as a sequence of small tasks that operate on domains of the DDM or on dense matrix blocks of the reduced matrix. These tasks can be statically scheduled for parallel execution using their DAG dependencies and weights that depend on estimates of computation and communication costs.
Comparisons with state-of-the-art exact direct solvers on electrically large problems suggest up to 20% better parallel efficiency, 30% - 3X less memory and slightly faster runtime, while maintaining the same accuracy.
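The static, weight-driven DAG scheduling described in the abstract above can be illustrated with a generic greedy list scheduler. This is a sketch under stated assumptions, not the D3M scheduler: the function `list_schedule`, the task names and the cost values are hypothetical, and communication costs are ignored.

```python
import heapq
from collections import defaultdict

def list_schedule(tasks, deps, n_workers):
    """Greedy static list scheduling of a weighted task DAG.

    tasks: dict mapping task name -> estimated cost
    deps:  dict mapping task name -> list of prerequisite task names
    Returns a dict mapping task name -> (worker, start, finish).
    """
    children = defaultdict(list)
    indegree = {t: 0 for t in tasks}
    for task, prerequisites in deps.items():
        for pre in prerequisites:
            children[pre].append(task)
            indegree[task] += 1
    # Ready queue ordered by the earliest time each task may start.
    ready = [(0.0, t) for t in sorted(tasks) if indegree[t] == 0]
    heapq.heapify(ready)
    # Worker queue ordered by the time each worker becomes free.
    workers = [(0.0, w) for w in range(n_workers)]
    heapq.heapify(workers)
    earliest = {t: 0.0 for t in tasks}
    plan = {}
    while ready:
        est, task = heapq.heappop(ready)
        free_at, worker = heapq.heappop(workers)
        start = max(est, free_at)
        finish = start + tasks[task]
        plan[task] = (worker, start, finish)
        heapq.heappush(workers, (finish, worker))
        # Release tasks whose prerequisites have now all been scheduled.
        for child in children[task]:
            earliest[child] = max(earliest[child], finish)
            indegree[child] -= 1
            if indegree[child] == 0:
                heapq.heappush(ready, (earliest[child], child))
    return plan

# Hypothetical example: four independent per-domain tasks feed one factorization task.
example_tasks = {"dtn_0": 2.0, "dtn_1": 2.0, "dtn_2": 2.0, "dtn_3": 2.0, "factor": 1.0}
example_deps = {"factor": ["dtn_0", "dtn_1", "dtn_2", "dtn_3"]}
example_plan = list_schedule(example_tasks, example_deps, n_workers=2)
```

With two workers the four independent tasks run in two rounds and the dependent task starts only after all of them finish. A production scheduler would also weight edges by communication cost, as the abstract notes.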
Randomized Computations for Efficient and Robust Finite Element Domain Decomposition Methods in Electromagnetics
Numerical modeling of electromagnetic (EM) phenomena has proved to be an effective and efficient tool in the design and optimization of modern electronic devices, integrated circuits (IC) and RF systems. However, the generality, efficiency and reliability/resilience of computational EM solvers are often criticised, because the underlying characteristics of the simulated problems usually differ, which makes the development of a general, "black-box" EM solver a difficult task.
In this work, we aim to propose a reliable/resilient, scalable and efficient finite-element-based domain decomposition method (FE-DDM) as a general CEM solver to tackle such CEM problems to some extent. We recognize the rank deficiency property of the Dirichlet-to-Neumann (DtN) operators involved in the previously proposed FETI-2 DDM formulation and apply this principle to improve the computational efficiency and robustness of FETI-2 DDM. Specifically, the rank-deficient DtN operator is computed by a randomized computation method that was originally proposed to approximate the matrix singular value decomposition (SVD). Numerical results show that up to 35% run-time and 75% memory savings in the computation of the DtN operators can be achieved on a realistic example. Later, this rank deficiency principle is incorporated into a new global DDM preconditioner (W-FETI) that is inspired by the matrix Woodbury identity. A numerical study of the eigenspectrum shows the validity of the proposed W-FETI global preconditioner. Several industrial-scale examples show a significant iterative convergence advantage of W-FETI that uses 35%-80% matrix-vector-products (MxVs) than state-of-the-art DDM solvers.
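The randomized approach to approximating an SVD mentioned in the abstract above can be sketched with a standard randomized range finder. This is a generic illustration, not the thesis implementation: the function names, the oversampling value and the real-valued test matrix are assumptions (DtN operators in electromagnetics are generally complex-valued).

```python
import numpy as np

def randomized_svd(A, rank, oversample=10, seed=0):
    """Approximate the leading `rank` singular triplets of A by random sampling."""
    rng = np.random.default_rng(seed)
    n = A.shape[1]
    k = min(rank + oversample, n)
    # Sample the range of A with a Gaussian test matrix.
    Omega = rng.standard_normal((n, k))
    Y = A @ Omega
    # Orthonormal basis for the sampled range.
    Q, _ = np.linalg.qr(Y)
    # Project A onto the basis and take a small dense SVD.
    B = Q.T @ A
    Ub, s, Vt = np.linalg.svd(B, full_matrices=False)
    U = Q @ Ub
    return U[:, :rank], s[:rank], Vt[:rank, :]

def low_rank_demo():
    """Relative error of the approximation on a hypothetical rank-5 matrix."""
    rng = np.random.default_rng(1)
    A = rng.standard_normal((200, 5)) @ rng.standard_normal((5, 200))
    U, s, Vt = randomized_svd(A, rank=5)
    return np.linalg.norm(A - (U * s) @ Vt) / np.linalg.norm(A)
```

For a matrix of exact low rank, as in `low_rank_demo`, the approximation error should be at the level of rounding. The cost is dominated by two products with `A`, which is where run-time and memory savings over a full SVD would come from.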
Combining Machine Learning and Domain Decomposition Methods – A Review
Scientific machine learning, an area of research that combines techniques from machine learning and scientific computing, is of increasing importance and receives growing attention. Here, our focus is on a very specific area within scientific machine learning: the combination of domain decomposition methods with machine learning techniques. The aim of the present work is to review existing and new approaches within this field and to present some known results in a unified framework; no claim of completeness is made. As a concrete example of machine-learning-enhanced domain decomposition methods, an approach is presented which uses neural networks to reduce the computational effort in adaptive domain decomposition methods while retaining their robustness. More precisely, deep neural networks are used to predict the geometric location of constraints which are needed to define a robust coarse space. Additionally, two recently published deep domain decomposition approaches are presented in a unified framework. Both approaches use physics-constrained neural networks to replace the discretization and solution of the subdomain problems of a given decomposition of the computational domain. Finally, a brief overview is given of several further approaches which combine machine learning with ideas from domain decomposition methods, either to increase the performance of existing algorithms or to create completely new methods.
Software for Exascale Computing - SPPEXA 2016-2019
This open access book summarizes the research done and results obtained in the second funding phase of the Priority Program 1648 "Software for Exascale Computing" (SPPEXA) of the German Research Foundation (DFG) presented at the SPPEXA Symposium in Dresden during October 21-23, 2019. In that respect, it both represents a continuation of Vol. 113 in Springer’s series Lecture Notes in Computational Science and Engineering, the corresponding report of SPPEXA’s first funding phase, and provides an overview of SPPEXA’s contributions towards exascale computing in today's supercomputer technology. The individual chapters address one or more of the research directions (1) computational algorithms, (2) system software, (3) application software, (4) data management and exploration, (5) programming, and (6) software tools. The book has an interdisciplinary appeal: scholars from computational sub-fields in computer science, mathematics, physics, or engineering will find it of particular interest.
Parallel computation in efficient non-linear finite element analysis with applications to soft-ground tunneling project
Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Civil and Environmental Engineering, 2004. Includes bibliographical references. Reliable prediction and control of ground movements represent an essential component of underground construction projects in congested urban environments, to mitigate possible damage to adjacent structures and utilities. This research was motivated by the construction of a large underground cavern for the Rio Piedras station in San Juan, Puerto Rico. This project involved the construction of a large, horseshoe-shaped cavern (17m wide and 16m high) in weathered alluvial soils. The crown of the cavern is located less than 5.5m below existing buildings in a busy commercial district. Structural support for the cavern was provided by a series of 15 stacked drifts. These 3m square-section galleries were excavated mainly by hand and in-filled with concrete, while a compensation grouting system was designed to mitigate effects of excavation-induced ground movements on the overlying structures. Unexpectedly large settlements occurred during drift construction and overwhelmed the grouting system that was intended to compensate for tunnel-induced movements. Although two-dimensional, non-linear finite element analyses of the stacked-drift construction suggest that movements exceeding 100mm can be expected, the 2-D representation of excavation and ground support is overly simplistic and represents a major source of uncertainty in these analyses. Massive computational efforts make more comprehensive 3-D models of the construction sequence completely impractical using existing finite element software with direct or iterative solver methods. This thesis develops, implements, and applies an efficient parallel computation scheme for solving such large-scale, non-linear finite element analyses.
The analyses couple a non-overlapping Domain Decomposition technique known as the FETI algorithm (Farhat & Roux, 1991) with a Newton-Raphson iteration scheme for non-linear material behavior. This method uses direct factorization of the equilibrium equations for sub-domains, while solving a separate interface problem iteratively with a mechanically consistent, Dirichlet preconditioner. The implementation allows independence of the number of sub-domains from the number of processors. This provides flexibility on mesh decomposition, control between iterative interface solutions and direct sub-domain solutions, and load balance in shared heterogeneous clusters. The analyses are performed with the developed code, FETI-FEM (programmed in C++ and MPI), using syntax consistent with pre-existing ABAQUS software. Benchmark testing on a Beowulf cluster of 16 interconnected commodity PCs found excellent parallel efficiency, while the computation time scales with the number of finite elements, NE, according to a power law with exponent p = 1.217. Parallel 3-D FE analyses have been applied in modeling the drift excavation, primary lining and infilling for the stacked-drift construction assuming a simplified soil profile. The resulting FE model comprised approximately 30,000 20-noded quadratic displacement-based elements, representing almost 400,000 degrees of freedom (at least one order of magnitude larger than any prior model reported in the geotechnical literature) and was sub-divided into 168 sub-domains ... by Yo-Ming Hsieh. Ph.D
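The power-law scaling quoted in the abstract above, with computation time proportional to NE raised to p = 1.217, can be made concrete with a short calculation. The sketch below is illustrative only: `fit_power_law` and `time_ratio` are hypothetical helper names, and no timing data from the thesis are reproduced.

```python
import math

def fit_power_law(sizes, times):
    """Least-squares fit of log(t) = log(c) + p * log(n); returns (c, p)."""
    xs = [math.log(n) for n in sizes]
    ys = [math.log(t) for t in times]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Slope of the regression line in log-log space is the exponent p.
    p = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    c = math.exp(mean_y - p * mean_x)
    return c, p

def time_ratio(size_ratio, p=1.217):
    """Factor by which run time grows when the element count grows by size_ratio."""
    return size_ratio ** p
```

Under this scaling, doubling the number of elements multiplies the run time by `time_ratio(2.0)`, roughly 2.32, which is only slightly worse than linear growth.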
Robust exact and inexact FETI-DP methods with applications to elasticity
Domain decomposition methods are fast parallel solvers for large systems of equations arising from the discretization of partial differential equations, e.g. from structural mechanics. In this work, dual iterative substructuring methods of the FETI-DP (Finite Element Tearing and Interconnecting Dual-Primal) type are developed and applied to second-order elliptic partial differential equations, with emphasis on elasticity. An attempt is made to develop robust methods for homogeneous and heterogeneous elasticity problems. New inexact FETI-DP methods are also introduced that allow for inexact coarse problem solvers and/or inexact subdomain solvers. It is shown that under certain conditions the new algorithms fulfill the same asymptotic condition number estimate as the traditional, exact FETI-DP methods. Parallel results using algebraic multigrid for the FETI-DP coarse problem show the improved scalability of the new algorithms.
- …