91,536 research outputs found
Domain Decomposition Based High Performance Parallel Computing\ud
The study deals with the parallelization of finite element based Navier-Stokes codes using domain decomposition and state-ofart sparse direct solvers. There has been significant improvement in the performance of sparse direct solvers. Parallel sparse direct solvers are not found to exhibit good scalability. Hence, the parallelization of sparse direct solvers is done using domain decomposition techniques. A highly efficient sparse direct solver PARDISO is used in this study. The scalability of both Newton and modified Newton algorithms are tested
Power System Dynamic Simulations Using a Parallel Two-Level Schur-Complement Decomposition
As the need for faster power system dynamic simulations increases, it is essential to develop new algorithms that exploit parallel computing to accelerate those simulations. This paper proposes a parallel algorithm based on a two-level, Schur-complement-based, domain decomposition method. The two-level partitioning provides high parallelization potential (coarse- and fine-grained). In addition, due to the Schur-complement approach used to update the sub-domain interface variables, the algorithm exhibits high global convergence rate. Finally, it provides significant numerical and computational acceleration. The algorithm is implemented using the shared-memory parallel programming model, targeting inexpensive multi-core machines. Its performance is reported on a real system as well as on a large test system combining transmission and distribution networks
Parallel algebraic domain decomposition solver for the solution of augmented systems
International audienceWe consider the parallel iterative solution of indefinite linear systems given as augmented systems. Our numerical technique is based on an algebraic non overlapping domain decomposition technique that only exploits the graph of the sparse matrix. This approach to high-performance, scalable solution of large sparse linear systems in parallel scientific computing, is to combine direct and iterative methods. We report numerical and parallel performance of the scheme on large matrices arising from the finite element discretization of linear elasticity in structural mechanics problems
The LifeV library: engineering mathematics beyond the proof of concept
LifeV is a library for the finite element (FE) solution of partial
differential equations in one, two, and three dimensions. It is written in C++
and designed to run on diverse parallel architectures, including cloud and high
performance computing facilities. In spite of its academic research nature,
meaning a library for the development and testing of new methods, one
distinguishing feature of LifeV is its use on real world problems and it is
intended to provide a tool for many engineering applications. It has been
actually used in computational hemodynamics, including cardiac mechanics and
fluid-structure interaction problems, in porous media, ice sheets dynamics for
both forward and inverse problems. In this paper we give a short overview of
the features of LifeV and its coding paradigms on simple problems. The main
focus is on the parallel environment which is mainly driven by domain
decomposition methods and based on external libraries such as MPI, the Trilinos
project, HDF5 and ParMetis.
Dedicated to the memory of Fausto Saleri.Comment: Review of the LifeV Finite Element librar
Large scale simulation of turbulence using a hybrid spectral/finite difference solver
Performing Direct Numerical Simulation (DNS) of turbulence on large-scale systems (offering more than 1024 cores) has become a challenge in high performance computing. The computer power increase allows now to solve flow problems on
large grids (with close to 10^9 nodes). Moreover these large scale simulations can be performed on non-homogeneous turbulent flows. A reasonable amount of time is needed to converge statistics if the large grid size is combined with a large number of cores. To this end we developed a Navier-Stokes solver, dedicated to situations where only one direction is heterogeneous, and particularly suitable for massive parallel architecture. Based on an hybrid approach spectral/finite-difference, we use a volumetric decomposition of the domain to extend the FFTs computation to a large number of cores. Scalability tests using up to 32K cores as well as preliminary results of a full simulation are presented
Domain Decomposition Methods in Optimal Flow Control for High Performance Computing
This thesis is concerned with linear and non-linear optimal flow control problems which are modeled by systems of partial differential equations. The numerical treatment of such problems, especially in the context of flow problems, is often very expensive and challenging. To tackle this complexity, we present parallel approaches based on non-overlapping domain decomposition methods that exploit the computational power provided by modern high performance computing technologies
Hybrid parallelization of an adaptive finite element code
summary:We present a hybrid OpenMP/MPI parallelization of the finite element method that is suitable to make use of modern high performance computers. These are usually built from a large bulk of multi-core systems connected by a fast network. Our parallelization method is based firstly on domain decomposition to divide the large problem into small chunks. Each of them is then solved on a multi-core system using parallel assembling, solution and error estimation. To make domain decomposition for both, the large problem and the smaller sub-problems, sufficiently fast we make use of a hierarchical mesh structure. The partitioning is done on a coarser mesh level, resulting in a very fast method that shows good computational balancing results. Numerical experiments show that both parallelization methods achieve good scalability in computing solution of nonlinear, time dependent, higher order PDEs on large domains. The parallelization is realized in the adaptive finite element software AMDiS
Performance and results of the high-resolution biogeochemical model PELAGOS025 v1.0 within NEMO v3.4
Abstract. The present work aims at evaluating the scalability performance of a high-resolution global ocean biogeochemistry model (PELAGOS025) on massive parallel architectures and the benefits in terms of the time-to-solution reduction. PELAGOS025 is an on-line coupling between the Nucleus for the European Modelling of the Ocean (NEMO) physical ocean model and the Biogeochemical Flux Model (BFM) biogeochemical model. Both the models use a parallel domain decomposition along the horizontal dimension. The parallelisation is based on the message passing paradigm. The performance analysis has been done on two parallel architectures, an IBM BlueGene/Q at ALCF (Argonne Leadership Computing Facilities) and an IBM iDataPlex with Sandy Bridge processors at the CMCC (Euro Mediterranean Center on Climate Change). The outcome of the analysis demonstrated that the lack of scalability is due to several factors such as the I/O operations, the memory contention, the load unbalancing due to the memory structure of the BFM component and, for the BlueGene/Q, the absence of a hybrid parallelisation approach
Recommended from our members
A Parallel Direct Method for Finite Element Electromagnetic Computations Based on Domain Decomposition
High performance parallel computing and direct (factorization-based) solution methods have been the two main trends in electromagnetic computations in recent years. When time-harmonic (frequency-domain) Maxwell\u27s equation are directly discretized with the Finite Element Method (FEM) or other Partial Differential Equation (PDE) methods, the resulting linear system of equations is sparse and indefinite, thus harder to efficiently factorize serially or in parallel than alternative methods e.g. integral equation solutions, that result in dense linear systems. State-of-the-art sparse matrix direct solvers such as MUMPS and PARDISO don\u27t scale favorably, have low parallel efficiency and high memory footprint. This work introduces a new class of sparse direct solvers based on domain decomposition method, termed Direct Domain Decomposition Method (D3M), which is reliable, memory efficient, and offers very good parallel scalability for arbitrary 3D FEM problems.
Unlike recent trends in approximate/low-rank solvers, this method focuses on `numerically exact\u27 solution methods as they are more reliable for complex `real-life\u27 models. The proposed method leverages physical insights at every stage of the development through a new symmetric domain decomposition method (DDM) with one set of Lagrange multipliers. Applying a special regularization scheme at the interfaces, either artificial loss or gain is introduced to each domain to eliminate non-physical internal resonances. A block-wise recursive algorithm based on Takahashi relationship is proposed for the efficient computation of discrete Dirichlet-to-Neumann (DtN) map to reduce the volumetric problem from all domains into an auxiliary surfacial problem defined on the domain interfaces only. Numerical results show up to 50% run-time saving in DtN map computation using the proposed block-wise recursive algorithm compared to alternative approaches. The auxiliary unknowns on the domain interfaces form a considerably (approximately an order of magnitude) smaller block-wise sparse matrix, which is efficiently factorized using a customized block LDL factorization with restricted pivoting to ensure stability.
The parallelization of the proposed D3M is realized based on Directed Acyclic Graph (DAG). Recent advances in parallel dense direct solvers, have shifted toward parallel implementation that rely on DAG scheduling to achieve highly efficient asynchronous parallel execution. However, adaptation of such schemes to sparse matrices is harder and often impractical. In D3M, computation of each domain\u27s discrete DtN map ``embarrassingly parallel\u27\u27, whereas the customized block LDLT is suitable for a block directed acyclic graph (B-DAG) task scheduling, similar to that used in dense matrix parallel direct solvers. In this approach, computations are represented as a sequence of small tasks that operate on domains of DDM or dense matrix blocks of the reduced matrix. These tasks can be statically scheduled for parallel execution using their DAG dependencies and weights that depend on estimates of computation and communication costs.
Comparisons with state-of-the-art exact direct solvers on electrically large problems suggest up to 20% better parallel efficiency, 30% - 3X less memory and slightly faster in runtime, while maintaining the same accuracy
RIACS
Topics considered include: high-performance computing; cognitive and perceptual prostheses (computational aids designed to leverage human abilities); autonomous systems. Also included: development of a 3D unstructured grid code based on a finite volume formulation and applied to the Navier-stokes equations; Cartesian grid methods for complex geometry; multigrid methods for solving elliptic problems on unstructured grids; algebraic non-overlapping domain decomposition methods for compressible fluid flow problems on unstructured meshes; numerical methods for the compressible navier-stokes equations with application to aerodynamic flows; research in aerodynamic shape optimization; S-HARP: a parallel dynamic spectral partitioner; numerical schemes for the Hamilton-Jacobi and level set equations on triangulated domains; application of high-order shock capturing schemes to direct simulation of turbulence; multicast technology; network testbeds; supercomputer consolidation project
- …