961 research outputs found

Computational complexity and memory usage for multi-frontal direct solvers used in p finite element analysis

Author: Amestoy
Cottrell
Duff
Hughes
Irons
Publication venue: 'Elsevier BV'
Publication date: 01/01/2011

The multi-frontal direct solver is the state of the art for the direct solution of linear systems. This paper provides computational complexity and memory usage estimates for the application of the multi-frontal direct solver algorithm on linear systems resulting from p finite elements. Specifically, we provide the estimates for systems resulting from C^0 polynomial spaces spanned by B-splines. The structured grid and uniform polynomial order used in isogeometric meshes simplify the analysis. © 2011 Published by Elsevier Ltd

Elsevier - Publisher Connector

Crossref

espace@Curtin

Computational complexity and memory usage for multi-frontal direct solvers in structured mesh finite elements

Author: Calo Victor M.
Collier Nathan
Pardo David
Paszynski Maciej
Publication venue
Publication date: 01/01/2012

The multi-frontal direct solver is the state-of-the-art algorithm for the direct solution of sparse linear systems. This paper provides computational complexity and memory usage estimates for the application of the multi-frontal direct solver algorithm on linear systems resulting from B-spline-based isogeometric finite elements, where the mesh is a structured grid. Specifically, we provide the estimates for systems resulting from C^{p-1} polynomial B-spline spaces and compare them to those obtained using C^0 spaces. Comment: 8 pages, 2 figures

arXiv.org e-Print Archive

espace@Curtin

An efficient multi-core implementation of a novel HSS-structured multifrontal solver using randomized sampling

Author: Ghysels Pieter
Li Xiaoye S.
Napov Artem
Rouet Francois-Henry
Williams Samuel
Publication venue
Publication date: 25/02/2015

We present a sparse linear system solver that is based on a multifrontal variant of Gaussian elimination and exploits low-rank approximation of the resulting dense frontal matrices. We use hierarchically semiseparable (HSS) matrices, which have low-rank off-diagonal blocks, to approximate the frontal matrices. For HSS matrix construction, a randomized sampling algorithm is used together with interpolative decompositions. The combination of the randomized compression with a fast ULV HSS factorization leads to a solver with lower computational complexity than the standard multifrontal method for many applications, resulting in speedups of up to 7-fold for problems in our test suite. The implementation targets many-core systems by using task parallelism with dynamic runtime scheduling. Numerical experiments show performance improvements over state-of-the-art sparse direct solvers. The implementation achieves high performance and good scalability on a range of modern shared memory parallel systems, including the Intel Xeon Phi (MIC). The code is part of a software package called STRUMPACK -- STRUctured Matrices PACKage, which also has a distributed memory component for dense rank-structured matrices.

arXiv.org e-Print Archive

Crossref

eScholarship - University of California

DI-fusion

Applications of a hyper-graph grammar system in adaptive finite-element computations

Author: Gurgul Piotr
Jopek Konrad
Paszyńska Anna
Pingali Keshav
Publication venue: 'Walter de Gruyter GmbH'
Publication date: 01/01/2018

This paper describes the application of a hyper-graph grammar system for modeling a three-dimensional adaptive finite element method. The hyper-graph grammar approach makes it possible to obtain a linear computational cost for adaptive mesh transformations and for computations performed over refined meshes. The computations are done by a hyper-graph grammar driven algorithm applicable to three-dimensional problems. For the case of typical refinements performed towards a point or an edge, the algorithm yields linear computational cost with respect to the number of mesh nodes for its sequential execution and logarithmic cost for its parallel execution. Such hyper-graph grammar productions are the mathematical formalism used to describe the computational algorithm implementing the finite element method. Each production indicates the smallest atomic task that can be executed concurrently. The mesh transformations and computations using the hyper-graph grammar-based approach have been tested in the GALOIS environment. We conclude the paper with some numerical results obtained on a shared-memory Linux cluster node, for the case of three-dimensional computational meshes refined towards a point, an edge and a face.

Crossref

Jagiellonian University Repository

Preparing sparse solvers for exascale computing.

Author: Anzt Hartwig
Boman Erik
Curfman McInnes Lois
Falgout Rob
Ghysels Pieter
Heroux Michael
Li Xiaoye
Meier Yang Ulrike
Rajamanickam Sivasankaran
Rupp Karl
Smith Barry
Tran Mills Richard
Yamazaki Ichitaro
Publication venue: eScholarship, University of California
Publication date: 01/03/2020

Sparse solvers provide essential functionality for a wide variety of scientific applications. Highly parallel sparse solvers are essential for continuing advances in high-fidelity, multi-physics and multi-scale simulations, especially as we target exascale platforms. This paper describes the challenges, strategies and progress of the US Department of Energy Exascale Computing project towards providing sparse solvers for exascale computing platforms. We address the demands of systems with thousands of high-performance node devices where exposing concurrency, hiding latency and creating alternative algorithms become essential. The efforts described here are works in progress, highlighting current success and upcoming challenges. This article is part of a discussion meeting issue 'Numerical algorithms for high-performance computational science'

eScholarship - University of California

Parallel Fast Isogeometric Solvers for Explicit Dynamics

Author: Calo Victor Manuel
Dalcin Lisandro
Paszyński Maciej
Woźniak Maciej
Łoś Marcin
Publication venue: Institute of Informatics, Slovak Academy of Sciences
Publication date: 12/06/2017

This paper presents a parallel implementation of the fast isogeometric solvers for explicit dynamics for solving non-stationary time-dependent problems. The algorithm is described in pseudo-code. We present theoretical estimates of the computational and communication complexities for a single time step of the parallel algorithm. The computational complexity is O(p^6 N/c t_comp) and the communication complexity is O(N/c^(2/3) t_comm), where p denotes the polynomial order of the B-spline basis with C^{p-1} global continuity, N denotes the number of elements, c is the number of processors forming a cube, t_comp refers to the execution time of a single operation, and t_comm refers to the time of sending a single datum. We compare theoretical estimates with numerical experiments performed on the LONESTAR Linux cluster from Texas Advanced Computing Center, using 1,000 processors. We apply the method to solve nonlinear flows in highly heterogeneous porous media.
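As a rough illustration of how the two estimates in this abstract scale, the sketch below evaluates their leading-order terms. It assumes the estimates are read as (p^6 N / c) * t_comp and (N / c^(2/3)) * t_comm and that all hidden constants equal 1, neither of which is stated in the abstract. The function name `step_cost` and the sample timings are hypothetical and are not taken from the paper.

```python
def step_cost(p, n_elements, n_procs, t_comp, t_comm):
    """Leading-order cost terms for one time step (hypothetical helper).

    Assumes unit constants inside the O(...) estimates, so the values
    returned are only meaningful for comparing how the terms scale.
    """
    # Computation term: (p^6 * N / c) operations, each taking t_comp.
    compute = (p ** 6) * n_elements / n_procs * t_comp
    # Communication term: (N / c^(2/3)) data items, each taking t_comm.
    communicate = n_elements / n_procs ** (2.0 / 3.0) * t_comm
    return compute, communicate


if __name__ == "__main__":
    # Illustrative inputs only: cubic B-splines, 10^6 elements,
    # 1,000 processors, made-up per-operation and per-datum times.
    comp, comm = step_cost(p=3, n_elements=10**6, n_procs=1000,
                           t_comp=1e-9, t_comm=1e-6)
    print(f"compute term: {comp:.3e}, communication term: {comm:.3e}")
```

Under this reading, doubling p multiplies the computation term by 64 and leaves the communication term unchanged, while adding processors reduces the computation term faster than the communication term.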

Computing and Informatics (E-Journal - Institute of Informatics, SAS, Bratislava)

Heuristic algorithm to predict the location of C^{0} separators for efficient isogeometric analysis simulations with direct solvers

Author: Jopek K.
Paszyńska Anna
Paszyński M.
Woźniak M.
Publication venue
Publication date: 01/01/2018

We focus on two- and three-dimensional isogeometric finite element method computations with tensor product C^k B-spline basis functions. We consider the computational cost of the multi-frontal direct solver algorithm executed over such tensor product grids. We present an algorithm for estimating the number of floating-point operations per mesh node resulting from the execution of the multi-frontal solver algorithm with the ordering obtained from the element partition trees. Next, we propose an algorithm that introduces C^0 separators between patches of elements of a given size based on the estimated number of flops per node. We show that the computational cost of the multi-frontal solver algorithm executed over the computational grids with C^0 separators introduced is around one or two orders of magnitude lower, while the approximability of the functional space is improved. We show O(N log N) computational complexity of the heuristic algorithm proposing the introduction of the C^0 separators between the patches of elements, reducing the computational cost of the multi-frontal solver algorithm.

Biblioteka Nauki - repozytorium artykułów

Jagiellonian University Repository