Parallel matrix inversion techniques

Author: Kumar M. J.
Lau K. K.
Venkatesh S.
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/1996

In this paper, we present techniques for inverting sparse, symmetric and positive definite matrices on parallel and distributed computers. We propose two algorithms, one for SIMD implementation and the other for MIMD implementation. These algorithms are modified versions of Gaussian elimination that take the sparseness of the matrix into account, and they perform better than the general parallel Gaussian elimination algorithm. To demonstrate the usefulness of our technique, we implemented the snake problem using our sparse matrix algorithm. Our studies reveal that the proposed sparse matrix inversion algorithm significantly reduces the time taken to obtain the solution of the snake problem. We also present the results of our experimental work.

Deakin Research Online

Linearly scaling direct method for accurately inverting sparse banded matrices

Author: Alvarez-Estrada R F
Anderson E
Avogadro
Briley W R
Calvo G F
Gillan M
Golub G H
Guardiola R
Haddad O M Al-Nimr M A Shatnawi G H
Hager G
Hennessy J L
Leimkuhler B
Lumsdaine A White J Webber D Sangiovanni-Vincentelli A
Mazars M
Pablo Echenique
Pablo García-Risueño
Press W H
Soler J M
Wakins D S
Publication venue: 'IOP Publishing'
Publication date: 16/12/2011

In many problems in Computational Physics and Chemistry, one finds a special kind of sparse matrix, termed a "banded matrix". These matrices, which have non-zero entries only within a given distance from the main diagonal, often need to be inverted in order to solve the associated linear system of equations. In this work, we introduce a new O(n) algorithm for solving such a system, where n × n is the size of the matrix. We derive the analytical recursive expressions that give the solution directly, as well as the pseudocode for its computer implementation. Moreover, we review the different options for parallelizing the method, we describe an extension that handles matrices that are banded plus a small number of non-zero entries outside the band, and we use the same ideas to produce a method for obtaining the full inverse matrix. Finally, we show that the new algorithm is competitive, both in accuracy and in numerical efficiency, with a standard method based on Gaussian elimination. We do this using sets of large random banded matrices, as well as the matrices that appear when one solves the 1D Poisson equation by finite differences. Comment: 24 pages, 5 figures, submitted to J. Comp. Phy
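For readers who want a concrete picture of a linear-time banded solve, the following is a minimal sketch of the classical Thomas algorithm for the narrowest band (tridiagonal), applied to the 1D Poisson equation by finite differences. It is not the algorithm proposed in the abstract above; the function names `solve_tridiagonal` and `poisson_1d`, the boundary conditions, and the absence of pivoting are assumptions made for illustration only.

```python
def solve_tridiagonal(lower, diag, upper, rhs):
    """Solve a tridiagonal system in O(n) with the Thomas algorithm.

    lower[i] is the entry left of the diagonal in row i (lower[0] is unused),
    upper[i] is the entry right of the diagonal in row i (upper[n-1] is unused).
    No pivoting is done, so the matrix is assumed to be diagonally dominant
    or symmetric positive definite.
    """
    n = len(diag)
    c = [0.0] * n  # modified super-diagonal
    d = [0.0] * n  # modified right-hand side
    c[0] = upper[0] / diag[0]
    d[0] = rhs[0] / diag[0]
    # Forward sweep: eliminate the sub-diagonal.
    for i in range(1, n):
        denom = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / denom if i < n - 1 else 0.0
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / denom
    # Back substitution.
    x = [0.0] * n
    x[n - 1] = d[n - 1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x


def poisson_1d(n, f):
    """Solve -u'' = f on (0, 1) with u(0) = u(1) = 0 on n interior points."""
    h = 1.0 / (n + 1)
    rhs = [h * h * f((i + 1) * h) for i in range(n)]
    return solve_tridiagonal([-1.0] * n, [2.0] * n, [-1.0] * n, rhs)
```

Both sweeps visit each row once, which is where the linear cost comes from; a wider band would replace the scalar updates with small fixed-size blocks.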

arXiv.org e-Print Archive

Crossref

Digital.CSIC

Fast Scalable Construction of (Minimal Perfect Hash) Functions

Author: A Goerdt
AM Frieze
AM Odlyzko
BA LaMacchia
BS Majewski
D Belazzougui
FC Botelho
M Aumüller
M Dietzfelbinger
N Fountoulakis
Publication date: 22/03/2016

Recent advances in random linear systems on finite fields have paved the way for the construction of constant-time data structures representing static functions and minimal perfect hash functions using less space than existing techniques. The main obstacle to any practical application of these results is the cubic-time Gaussian elimination required to solve these linear systems: although the systems can be made very small, the computation is still too slow to be feasible. In this paper we describe in detail a number of heuristics and programming techniques that speed up the solution of these systems by several orders of magnitude, making the overall construction competitive with the standard and widely used MWHC technique, which is based on hypergraph peeling. In particular, we introduce broadword programming techniques for fast equation manipulation and a lazy Gaussian elimination algorithm. We also describe a number of technical improvements to the data structure which further reduce space usage and improve lookup speed. Our implementation of these techniques yields a minimal perfect hash function data structure occupying 2.24 bits per element, compared to 2.68 for MWHC-based ones, and a static function data structure which reduces the multiplicative overhead from 1.23 to 1.03.
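To make the idea of word-level equation manipulation concrete, here is a minimal sketch of Gauss-Jordan elimination over GF(2) in which each equation is packed into a single Python integer, so that adding one equation to another is a single XOR. This is plain textbook elimination, not the lazy Gaussian elimination or the broadword techniques of the abstract above; the function name `solve_gf2` and the bit layout are assumptions chosen for illustration.

```python
def solve_gf2(rows, rhs, num_vars):
    """Solve A x = b over GF(2) with bit-packed equations.

    rows[i] is an int whose bit j is the coefficient of x_j in equation i,
    and rhs[i] is 0 or 1. Returns one solution as a list of bits (free
    variables are set to 0), or None if the system is inconsistent.
    """
    # Store the right-hand side as bit number num_vars of each equation.
    eqs = [r | (b << num_vars) for r, b in zip(rows, rhs)]
    pivot_row_of = {}
    rank = 0
    for col in range(num_vars):
        mask = 1 << col
        pivot = next((i for i in range(rank, len(eqs)) if eqs[i] & mask), None)
        if pivot is None:
            continue  # no pivot in this column: x_col is a free variable
        eqs[rank], eqs[pivot] = eqs[pivot], eqs[rank]
        # Clear this column from every other equation with one XOR each.
        for i in range(len(eqs)):
            if i != rank and eqs[i] & mask:
                eqs[i] ^= eqs[rank]
        pivot_row_of[col] = rank
        rank += 1
    # A leftover equation reading "0 = 1" means there is no solution.
    if any(e == 1 << num_vars for e in eqs[rank:]):
        return None
    x = [0] * num_vars
    for col, row in pivot_row_of.items():
        x[col] = (eqs[row] >> num_vars) & 1
    return x
```

For example, the system x0 + x1 = 1, x1 + x2 = 0, x0 + x2 = 1 is passed as `solve_gf2([0b011, 0b110, 0b101], [1, 0, 1], 3)`.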

arXiv.org e-Print Archive

Crossref

Recognizing sparse perfect elimination bipartite graphs

Author: Bomhoff Matthijs
Publication venue: Department of Applied Mathematics, University of Twente
Publication date: 01/01/2010

When applying Gaussian elimination to a sparse matrix, it is desirable to avoid turning zeros into non-zeros to preserve the sparsity. The class of perfect elimination bipartite graphs is closely related to square matrices that Gaussian elimination can be applied to without turning any zero into a non-zero. Existing literature on the recognition of this class and finding suitable pivots mainly focusses on time complexity. For $n \times n$ matrices with $m$ non-zero elements, the currently best known algorithm has a time complexity of $O(n^3/\log n)$. However, when viewed from a practical perspective, the space complexity also deserves attention: it may not be worthwhile to look for a suitable set of pivots for a sparse matrix if this requires $\Omega(n^2)$ space. We present two new algorithms for the recognition of sparse instances: one with a $O(n m)$ time complexity in $\Theta(n^2)$ space and one with a $O(m^2)$ time complexity in $\Theta(m)$ space. Furthermore, if we allow only pivots on the diagonal, our second algorithm can easily be adapted to run in time $O(n m)$.
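As an illustration of what a pivot that turns no zero into a non-zero looks like, the sketch below runs a brute-force greedy search over the sparsity pattern of a square matrix. It makes no attempt to match the time or space bounds quoted in the abstract above and is not either of the algorithms it announces; the function name `find_fill_free_pivots` and the set-of-positions input format are assumptions made for this example.

```python
def find_fill_free_pivots(nonzeros, n):
    """Greedily look for n pivots that cause no fill-in.

    nonzeros is a set of (row, col) positions of the non-zero entries of an
    n x n matrix. Returns the pivots in elimination order, or None if the
    greedy search gets stuck before finding n of them.
    """
    rows = {r: set() for r in range(n)}  # row -> columns with a non-zero
    cols = {c: set() for c in range(n)}  # column -> rows with a non-zero
    for r, c in nonzeros:
        rows[r].add(c)
        cols[c].add(r)

    def is_fill_free(r, c):
        # Pivoting on (r, c) updates every entry (r2, c2) with r2 in
        # column c and c2 in row r; all of them must already be non-zero.
        return all(c2 in rows[r2] for r2 in cols[c] for c2 in rows[r])

    pivots = []
    for _ in range(n):
        choice = next(
            ((r, c) for r in sorted(rows) for c in sorted(rows[r])
             if is_fill_free(r, c)),
            None,
        )
        if choice is None:
            return None
        r, c = choice
        pivots.append(choice)
        # Remove the pivot row and column from the remaining pattern.
        for c2 in rows.pop(r):
            cols[c2].discard(r)
        for r2 in cols.pop(c):
            rows[r2].discard(c)
    return pivots
```

An "arrow" pattern (diagonal plus a dense last row and column) can be eliminated along the diagonal without fill-in, whereas a pattern whose bipartite graph is a 6-cycle has no fill-free pivot at all.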

CiteSeerX

University of Twente Research Information

Efficient Decomposition of Dense Matrices over GF(2)

Author: Albrecht Martin R.
Pernet Clément
Publication date: 01/01/2010

In this work we describe an efficient implementation of a hierarchy of algorithms for the decomposition of dense matrices over the field with two elements (GF(2)). Matrix decomposition is an essential building block for solving dense systems of linear and non-linear equations, and thus much research has been devoted to improving the asymptotic complexity of such algorithms. In this work we discuss an implementation of both well-known and improved algorithms in the M4RI library. The focus of our discussion is on a new variant of the M4RI algorithm (denoted MMPF in this work), which allows for considerable performance gains in practice when compared to the previously fastest implementation. We provide performance figures on x86_64 CPUs to demonstrate the viability of our approach.

arXiv.org e-Print Archive

Hal - Université Grenoble Alpes

INRIA a CCSD electronic archive server

Computational linear algebra over finite fields

Author: Dumas Jean-Guillaume
Pernet Clément
Publication date: 17/04/2012

We present here algorithms for efficient computation of linear algebra problems over finite fields.

arXiv.org e-Print Archive

CiteSeerX

Hal - Université Grenoble Alpes

INRIA a CCSD electronic archive server

An efficient multi-core implementation of a novel HSS-structured multifrontal solver using randomized sampling

Author: Ghysels Pieter
Li Xiaoye S.
Napov Artem
Rouet Francois-Henry
Williams Samuel
Publication date: 25/02/2015

We present a sparse linear system solver that is based on a multifrontal variant of Gaussian elimination and exploits low-rank approximation of the resulting dense frontal matrices. We use hierarchically semiseparable (HSS) matrices, which have low-rank off-diagonal blocks, to approximate the frontal matrices. For HSS matrix construction, a randomized sampling algorithm is used together with interpolative decompositions. The combination of the randomized compression with a fast ULV HSS factorization leads to a solver with lower computational complexity than the standard multifrontal method for many applications, resulting in speedups of up to 7-fold for problems in our test suite. The implementation targets many-core systems by using task parallelism with dynamic runtime scheduling. Numerical experiments show performance improvements over state-of-the-art sparse direct solvers. The implementation achieves high performance and good scalability on a range of modern shared memory parallel systems, including the Intel Xeon Phi (MIC). The code is part of a software package called STRUMPACK (STRUctured Matrices PACKage), which also has a distributed memory component for dense rank-structured matrices.

arXiv.org e-Print Archive

eScholarship - University of California

DI-fusion