961 research outputs found
Computational complexity and memory usage for multi-frontal direct solvers used in p finite element analysis
The multi-frontal direct solver is the state of the art for the direct solution of linear systems. This paper provides computational complexity and memory usage estimates for the application of the multi-frontal direct solver algorithm to linear systems resulting from p finite elements. Specifically, we provide the estimates for systems resulting from C0 polynomial spaces spanned by B-splines. The structured grid and uniform polynomial order used in isogeometric meshes simplify the analysis. © 2011 Published by Elsevier Ltd
Computational complexity and memory usage for multi-frontal direct solvers in structured mesh finite elements
The multi-frontal direct solver is the state-of-the-art algorithm for the
direct solution of sparse linear systems. This paper provides computational
complexity and memory usage estimates for the application of the multi-frontal
direct solver algorithm on linear systems resulting from B-spline-based
isogeometric finite elements, where the mesh is a structured grid. Specifically,
we provide the estimates for systems resulting from polynomial
B-spline spaces and compare them to those obtained using spaces.
Comment: 8 pages, 2 figures
An efficient multi-core implementation of a novel HSS-structured multifrontal solver using randomized sampling
We present a sparse linear system solver that is based on a multifrontal
variant of Gaussian elimination, and exploits low-rank approximation of the
resulting dense frontal matrices. We use hierarchically semiseparable (HSS)
matrices, which have low-rank off-diagonal blocks, to approximate the frontal
matrices. For HSS matrix construction, a randomized sampling algorithm is used
together with interpolative decompositions. The combination of the randomized
compression with a fast ULV HSS factorization leads to a solver with lower
computational complexity than the standard multifrontal method for many
applications, resulting in speedups of up to 7-fold for problems in our test
suite. The implementation targets many-core systems by using task parallelism
with dynamic runtime scheduling. Numerical experiments show performance
improvements over state-of-the-art sparse direct solvers. The implementation
achieves high performance and good scalability on a range of modern shared
memory parallel systems, including the Intel Xeon Phi (MIC). The code is part
of a software package called STRUMPACK -- STRUctured Matrices PACKage, which
also has a distributed memory component for dense rank-structured matrices
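
The randomized compression step mentioned in this abstract can be illustrated with a minimal sketch. It is not the STRUMPACK implementation: it compresses a single dense block with a Gaussian random sample followed by a QR factorization, whereas the abstract describes interpolative decompositions inside an HSS hierarchy. The function name `randomized_low_rank`, the oversampling parameter, and the test matrix are all assumptions made for illustration.

```python
import numpy as np

def randomized_low_rank(block, rank, oversample=5, seed=0):
    """Approximate `block` as q @ coeffs using a random sample of its range.

    Hypothetical helper: a basic randomized range finder, not the
    interpolative-decomposition scheme used by the solver in the abstract.
    """
    rng = np.random.default_rng(seed)
    n_cols = block.shape[1]
    # Multiply the block by a thin Gaussian matrix to sample its column space.
    omega = rng.standard_normal((n_cols, rank + oversample))
    sample = block @ omega
    # Orthonormalize the sample; q spans (approximately) the range of block.
    q, _ = np.linalg.qr(sample)
    # Project the block onto that basis.
    coeffs = q.T @ block
    return q, coeffs

# Stand-in for an off-diagonal block: a 60 x 40 matrix of exact rank 5.
rng = np.random.default_rng(1)
true_rank = 5
block = rng.standard_normal((60, true_rank)) @ rng.standard_normal((true_rank, 40))

q, coeffs = randomized_low_rank(block, rank=true_rank)
rel_error = np.linalg.norm(block - q @ coeffs) / np.linalg.norm(block)
print(q.shape, coeffs.shape, rel_error)
```

Only matrix products with the block are needed to build the basis, which is why sampling-based compression is attractive when the block itself is expensive to form explicitly.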
Applications of a hyper-graph grammar system in adaptive finite-element computations
This paper describes the application of a hyper-graph grammar system to modeling a three-dimensional adaptive finite element method. The hyper-graph grammar approach yields a linear computational cost for adaptive mesh transformations and for computations performed over refined meshes. The computations are done by a hyper-graph grammar driven algorithm applicable to three-dimensional problems. For typical refinements performed towards a point or an edge, the algorithm yields linear computational cost with respect to the number of mesh nodes for its sequential execution and logarithmic cost for its parallel execution. The hyper-graph grammar productions are the mathematical formalism used to describe the computational algorithm implementing the finite element method. Each production indicates the smallest atomic task that can be executed concurrently. The mesh transformations and computations using the hyper-graph grammar-based approach have been tested in the GALOIS environment. We conclude the paper with numerical experiments performed on a shared-memory Linux cluster node, for three-dimensional computational meshes refined towards a point, an edge and a face
Preparing sparse solvers for exascale computing.
Sparse solvers provide essential functionality for a wide variety of scientific applications. Highly parallel sparse solvers are essential for continuing advances in high-fidelity, multi-physics and multi-scale simulations, especially as we target exascale platforms. This paper describes the challenges, strategies and progress of the US Department of Energy Exascale Computing project towards providing sparse solvers for exascale computing platforms. We address the demands of systems with thousands of high-performance node devices where exposing concurrency, hiding latency and creating alternative algorithms become essential. The efforts described here are works in progress, highlighting current success and upcoming challenges. This article is part of a discussion meeting issue 'Numerical algorithms for high-performance computational science'
Parallel Fast Isogeometric Solvers for Explicit Dynamics
This paper presents a parallel implementation of the fast isogeometric solvers for explicit dynamics for solving non-stationary time-dependent problems. The algorithm is described in pseudo-code. We present theoretical estimates of the computational and communication complexities for a single time step of the parallel algorithm. The computational complexity is O(p^6 (N/c) t_comp) and the communication complexity is O((N/c^(2/3)) t_comm), where p denotes the polynomial order of the B-spline basis with C^(p-1) global continuity, N denotes the number of elements, c is the number of processors forming a cube, t_comp refers to the execution time of a single operation, and t_comm refers to the time of sending a single datum. We compare theoretical estimates with numerical experiments performed on the LONESTAR Linux cluster from the Texas Advanced Computing Center, using 1,000 processors. We apply the method to solve nonlinear flows in highly heterogeneous porous media
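
To make the two estimates in this abstract concrete, here is a small sketch that evaluates them for given parameters. The function name `step_cost` is hypothetical, and the constants hidden by the O-notation are assumed to be 1, so the numbers indicate scaling only, not actual run times.

```python
def step_cost(p, n_elements, n_procs, t_comp, t_comm):
    """Evaluate the per-time-step estimates quoted in the abstract.

    Assumption: hidden constants are set to 1, so only the scaling in
    p, N and c is meaningful.
    """
    # Computational estimate: p^6 * (N / c) * t_comp
    compute = p ** 6 * (n_elements / n_procs) * t_comp
    # Communication estimate: (N / c^(2/3)) * t_comm
    communicate = (n_elements / n_procs ** (2.0 / 3.0)) * t_comm
    return compute, communicate

# Quadratic B-splines, 1000 elements, 8 processors (a 2 x 2 x 2 cube),
# with unit cost per operation and per datum sent.
compute, communicate = step_cost(p=2, n_elements=1000, n_procs=8,
                                 t_comp=1.0, t_comm=1.0)
print(compute, communicate)
```

Under these assumptions, doubling the number of processors halves the computational term but reduces the communication term only by a factor of 2^(2/3), so communication becomes relatively more important as c grows.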
Heuristic algorithm to predict the location of C^{0} separators for efficient isogeometric analysis simulations with direct solvers
We focus on two- and three-dimensional isogeometric finite element method computations with tensor product C^k B-spline basis functions. We consider the computational cost of the multi-frontal direct solver algorithm executed over such tensor product grids. We present an algorithm for estimating the number of floating-point operations per mesh node resulting from the execution of the multi-frontal solver algorithm with the ordering obtained from the element partition trees. Next, we propose an algorithm that introduces C^0 separators between patches of elements of a given size, based on the estimated number of flops per node. We show that the computational cost of the multi-frontal solver algorithm executed over computational grids with C^0 separators is around one or two orders of magnitude lower, while the approximability of the functional space is improved. We show that the heuristic algorithm, which proposes where to introduce the C^0 separators between patches of elements in order to reduce the computational cost of the multi-frontal solver algorithm, has O(N log N) computational complexity
- …