
1,056 research outputs found

Multilevel communication optimal LU and QR factorizations for hierarchical platforms

Author: Grigori Laura
Jacquelin Mathias
Khabou Amal
Publication date: 13/03/2013

This study focuses on the performance of two classical dense linear algebra algorithms, the LU and the QR factorizations, on multilevel hierarchical platforms. We first introduce a new model called Hierarchical Cluster Platform (HCP), encapsulating the characteristics of such platforms. The focus is set on reducing the communication requirements of studied algorithms at each level of the hierarchy. Lower bounds on communications are therefore extended with respect to the HCP model. We then introduce multilevel LU and QR algorithms tailored for those platforms, and provide a detailed performance analysis. We also provide a set of numerical experiments and performance predictions demonstrating the need for such algorithms on large platforms

arXiv.org e-Print Archive

INRIA a CCSD electronic archive server

MIMS EPrints


Preparing sparse solvers for exascale computing.

Author: Anzt Hartwig
Boman Erik
Curfman McInnes Lois
Falgout Rob
Ghysels Pieter
Heroux Michael
Li Xiaoye
Meier Yang Ulrike
Rajamanickam Sivasankaran
Rupp Karl
Smith Barry
Tran Mills Richard
Yamazaki Ichitaro
Publication venue: eScholarship, University of California
Publication date: 01/03/2020

Sparse solvers provide essential functionality for a wide variety of scientific applications. Highly parallel sparse solvers are essential for continuing advances in high-fidelity, multi-physics and multi-scale simulations, especially as we target exascale platforms. This paper describes the challenges, strategies and progress of the US Department of Energy Exascale Computing project towards providing sparse solvers for exascale computing platforms. We address the demands of systems with thousands of high-performance node devices where exposing concurrency, hiding latency and creating alternative algorithms become essential. The efforts described here are works in progress, highlighting current success and upcoming challenges. This article is part of a discussion meeting issue 'Numerical algorithms for high-performance computational science'

eScholarship - University of California

An MPI-OpenMP Hybrid Parallel H

Author: Han Guo
Jun Hu
Zaiping Nie
Publication venue: 'Hindawi Limited'
Publication date: 01/01/2015

In this paper we propose a high-performance parallel strategy for implementing a fast direct solver based on the hierarchical matrices method. Our goal is to directly solve electromagnetic integral equations involving electrically large and geometrically complex targets, which are traditionally difficult to solve by iterative methods. The parallel method of our direct solver features both OpenMP shared-memory programming and MPI message passing for running on a computer cluster. With modifications to the core direct-solving algorithm of hierarchical LU factorization, the new fast solver is scalable for parallelized implementation despite its sequential nature. The numerical experiments demonstrate the accuracy and efficiency of the proposed parallel direct solver for analyzing electromagnetic scattering problems of complex 3D objects with nearly 4 million unknowns

Crossref

Directory of Open Access Journals

Lecture 13: A low-rank factorization framework for building scalable algebraic solvers and preconditioners

Author: Li X. Sherry
Publication venue: ScholarWorks@UARK
Publication date: 08/04/2021

Factorization-based preconditioning algorithms, most notably incomplete LU (ILU) factorization, have been shown to be robust and applicable to wide ranges of problems. However, traditional ILU algorithms are not amenable to scalable implementation. In recent years, we have seen many investigations using low-rank compression techniques to build approximate factorizations. A key to achieving lower complexity is the use of hierarchical matrix algebra, stemming from the H-matrix research. In addition, the multilevel algorithm paradigm provides a good vehicle for a scalable implementation. The goal of this lecture is to give an overview of the various hierarchical matrix formats, such as hierarchically semi-separable matrix (HSS), hierarchically off-diagonal low-rank matrix (HODLR) and Butterfly matrix, and explain the algorithm differences and approximation quality. We will illustrate many practical issues of these algorithms using our parallel libraries STRUMPACK and ButterflyPACK, and demonstrate their effectiveness and scalability on very challenging problems, such as high-frequency wave equations

UARK (University of Arkansas)

ScholarWorks@UARK

Book of Abstracts of the Sixth SIAM Workshop on Combinatorial Scientific Computing

Author: Uçar Bora
Publication venue: 'Society for Industrial & Applied Mathematics (SIAM)'
Publication date: 01/08/2014

Book of Abstracts of CSC14, edited by Bora Uçar. The Sixth SIAM Workshop on Combinatorial Scientific Computing, CSC14, was organized at the Ecole Normale Supérieure de Lyon, France on 21st to 23rd July, 2014. This two-and-a-half-day event marked the sixth in a series that started ten years ago in San Francisco, USA. The CSC14 Workshop's focus was on combinatorial mathematics and algorithms in high performance computing, broadly interpreted. The workshop featured three invited talks, 27 contributed talks and eight poster presentations. The three invited talks focused on two fields of research: randomized algorithms for numerical linear algebra, and network analysis. The contributed talks and the posters targeted modeling, analysis, bisection, clustering, and partitioning of graphs, applied in the context of networks, sparse matrix factorizations, iterative solvers, fast multipole methods, automatic differentiation, high-performance computing, and linear programming. The workshop was held at the premises of the LIP laboratory of ENS Lyon and was generously supported by the LABEX MILYON (ANR-10-LABX-0070, Université de Lyon, within the program "Investissements d'Avenir" ANR-11-IDEX-0007 operated by the French National Research Agency), and by SIAM

HAL-ENS-LYON

INRIA a CCSD electronic archive server

Hal-Diderot

α-GMRES: A New Parallelizable Iterative Solver for Large Sparse Nonsymmetric Linear Systems Arising from CFD. G.U. Aero Report 9110

Author: Qin N.
Richards B.E.
Xu X.
Publication venue: Department of Aerospace Engineering, University of Glasgow
Publication date: 01/08/1991

Linearization of the non-linear systems arising from fully implicit schemes in computational fluid dynamics often results in a large sparse non-symmetric linear system. Practical experience shows that these linear systems are ill-conditioned if a higher than first order spatial discretization scheme is used. To solve these linear systems, an efficient multilevel iterative method, the α-GMRES method, is proposed, which incorporates diagonal preconditioning with a damping factor α so that balanced fast convergence of the inner GMRES iteration and the outer damping loop can be achieved. With this simple and efficient preconditioning and damping of the matrix, the resulting method can be effectively parallelized. The parallelization maintains the effectiveness of the original scheme due to the algorithmic equivalence of the sequential and the parallel versions

Enlighten
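
The α-GMRES abstract above does not state the iteration itself, so the following is only an illustrative sketch of one way a damping factor α and a diagonal preconditioner can form an outer loop around an inner linear solve. It assumes the outer step solves (A + αD) x_new = b + αD x_old with D = diag(A); this is an assumption, not the report's algorithm. The helper names (`solve_dense`, `damped_outer_loop`) are hypothetical, and a small dense elimination stands in for the inner GMRES iteration to keep the example self-contained.

```python
def solve_dense(M, rhs):
    """Gaussian elimination with partial pivoting.
    Stand-in for the inner GMRES solve (assumption made for brevity)."""
    n = len(M)
    aug = [list(row) + [r] for row, r in zip(M, rhs)]
    for k in range(n):
        p = max(range(k, n), key=lambda i: abs(aug[i][k]))
        aug[k], aug[p] = aug[p], aug[k]
        for i in range(k + 1, n):
            f = aug[i][k] / aug[k][k]
            for j in range(k, n + 1):
                aug[i][j] -= f * aug[k][j]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - s) / aug[i][i]
    return x


def residual_norm(A, x, b):
    """Max-norm of A x - b."""
    return max(abs(sum(a * xj for a, xj in zip(row, x)) - bi)
               for row, bi in zip(A, b))


def damped_outer_loop(A, b, alpha, tol=1e-12, max_outer=200):
    """Assumed outer damping loop: (A + alpha*D) x_new = b + alpha*D x_old."""
    n = len(A)
    d = [A[i][i] for i in range(n)]  # D = diag(A)
    shifted = [[A[i][j] + (alpha * d[i] if i == j else 0.0) for j in range(n)]
               for i in range(n)]
    x = [0.0] * n
    for outer in range(1, max_outer + 1):
        rhs = [b[i] + alpha * d[i] * x[i] for i in range(n)]
        x = solve_dense(shifted, rhs)  # inner solve on the damped system
        if residual_norm(A, x, b) < tol:
            return x, outer
    return x, max_outer


# Small non-symmetric, diagonally dominant test matrix (illustrative data).
A = [[4.0, -1.0, 0.0, 0.0],
     [-2.0, 4.0, -1.0, 0.0],
     [0.0, -2.0, 4.0, -1.0],
     [0.0, 0.0, -2.0, 4.0]]
b = [1.0, 2.0, 3.0, 4.0]
x, outer_steps = damped_outer_loop(A, b, alpha=1.0)
```

Any fixed point of this loop satisfies A x = b, since the αD x terms cancel. In this assumed scheme a larger α makes the shifted matrix more diagonally dominant, which would make each inner solve easier for an iterative method, but it slows the outer loop; that trade-off is one reading of the "balanced" convergence the abstract mentions.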

A distributed-memory package for dense Hierarchically Semi-Separable matrix computations using randomization

Author: Ghysels Pieter
Li Xiaoye S.
Napov Artem
Rouet François-Henry
Publication date: 26/06/2015

We present a distributed-memory library for computations with dense structured matrices. A matrix is considered structured if its off-diagonal blocks can be approximated by a rank-deficient matrix with low numerical rank. Here, we use Hierarchically Semi-Separable representations (HSS). Such matrices appear in many applications, e.g., finite element methods, boundary element methods, etc. Exploiting this structure allows for fast solution of linear systems and/or fast computation of matrix-vector products, which are the two main building blocks of matrix computations. The compression algorithm that we use, which computes the HSS form of an input dense matrix, relies on randomized sampling with a novel adaptive sampling mechanism. We discuss the parallelization of this algorithm and also present the parallelization of structured matrix-vector product, structured factorization and solution routines. The efficiency of the approach is demonstrated on large problems from different academic and industrial applications, on up to 8,000 cores. This work is part of a more global effort, the STRUMPACK (STRUctured Matrices PACKage) software package for computations with sparse and dense structured matrices. Hence, although useful in their own right, the routines also represent a step in the direction of a distributed-memory sparse solver

arXiv.org e-Print Archive

Crossref

eScholarship - University of California

DI-fusion
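
The HSS abstract above rests on two ideas: off-diagonal blocks with low numerical rank, and compression by randomized sampling. The sketch below illustrates both on a single block using a basic randomized range finder: multiply the block by a random Gaussian matrix, orthonormalize the samples, and project. It is not STRUMPACK's adaptive sampling mechanism or its API; the function names and the test block (built from a hypothetical smooth kernel with exact rank 3) are assumptions made for illustration.

```python
import random


def matmul(A, B):
    """Dense product of two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]


def transpose(A):
    return [list(col) for col in zip(*A)]


def orthonormal_columns(Y, tol=1e-10):
    """Modified Gram-Schmidt on the columns of Y.
    Columns that are numerically dependent on earlier ones are dropped."""
    basis = []
    for col in transpose(Y):
        v = list(col)
        for q in basis:
            dot = sum(a * c for a, c in zip(q, v))
            v = [a - dot * c for a, c in zip(v, q)]
        norm = sum(a * a for a in v) ** 0.5
        if norm > tol:
            basis.append([a / norm for a in v])
    return transpose(basis)


def randomized_compress(A, num_samples, seed=0):
    """Return Q (orthonormal columns) and B such that A is close to Q B."""
    rng = random.Random(seed)
    n_cols = len(A[0])
    omega = [[rng.gauss(0.0, 1.0) for _ in range(num_samples)]
             for _ in range(n_cols)]
    Q = orthonormal_columns(matmul(A, omega))  # basis for the sampled range
    B = matmul(transpose(Q), A)                # projection of A onto that basis
    return Q, B


def max_abs_error(A, Q, B):
    approx = matmul(Q, B)
    return max(abs(a - c) for ra, rc in zip(A, approx) for a, c in zip(ra, rc))


# Hypothetical 8x8 block: 1 + xy + (xy)^2 has exact rank 3.
xs = [i / 8 for i in range(8)]
ys = [j / 8 for j in range(8)]
block = [[1.0 + x * y + (x * y) ** 2 for y in ys] for x in xs]

Q, B = randomized_compress(block, num_samples=5)
detected_rank = len(Q[0])
```

Here five random samples are drawn for a block of rank 3; the surplus samples are numerically dependent and are discarded during orthonormalization, so `Q` keeps only as many columns as the block's rank. Storing `Q` and `B` instead of the full block is what makes structured matrix-vector products cheaper when the rank is small relative to the block size.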