Minimizing Communication in Linear Algebra

Author: Blackford L. S.
Grey Ballard
James Demmel
Oded Schwartz
Olga Holtz
Publication venue: 'Society for Industrial & Applied Mathematics (SIAM)'
Publication date: 01/01/2009

In 1981 Hong and Kung proved a lower bound on the amount of communication needed to perform dense matrix multiplication using the conventional $O(n^3)$ algorithm, where the input matrices were too large to fit in the small, fast memory. In 2004 Irony, Toledo and Tiskin gave a new proof of this result and extended it to the parallel case. In both cases the lower bound may be expressed as $\Omega(\#\text{arithmetic operations} / \sqrt{M})$, where $M$ is the size of the fast memory (or local memory in the parallel case). Here we generalize these results to a much wider variety of algorithms, including LU factorization, Cholesky factorization, $LDL^T$ factorization, QR factorization, and algorithms for eigenvalues and singular values, i.e., essentially all direct methods of linear algebra. The proof works for dense or sparse matrices, and for sequential or parallel algorithms. In addition to lower bounds on the amount of data moved (bandwidth) we get lower bounds on the number of messages required to move it (latency). We illustrate how to extend our lower bound technique to compositions of linear algebra operations (like computing powers of a matrix), to decide whether it is enough to call a sequence of simpler optimal algorithms (like matrix multiplication) to minimize communication, or if we can do better. We give examples of both. We also show how to extend our lower bounds to certain graph theoretic problems. We point out recently designed algorithms for dense LU, Cholesky, QR, eigenvalue and the SVD problems that attain these lower bounds; implementations of LU and QR show large speedups over conventional linear algebra algorithms in standard libraries like LAPACK and ScaLAPACK. Many open problems remain.

Comment: 27 pages, 2 tables
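To make the $\Omega(\#\text{arithmetic operations} / \sqrt{M})$ scaling concrete, the following is a minimal sketch that counts the words a square-tiled matrix multiplication moves between slow and fast memory and compares that count with the scale of the bound. The two-level memory model, the tile-size choice (three tiles resident at once), and the function name `blocked_matmul_traffic` are illustrative assumptions, not taken from the paper.

```python
import math


def blocked_matmul_traffic(n: int, fast_memory_words: int) -> dict:
    """Estimate slow<->fast memory traffic of a tiled n x n matrix multiply.

    Assumed model (not from the paper): one tile each of A, B and C must be
    resident in fast memory at the same time, and every tile transfer costs
    b*b words.
    """
    # Largest tile side b such that three b x b tiles fit in fast memory.
    b = max(1, math.isqrt(fast_memory_words // 3))
    tiles = math.ceil(n / b)

    # Each of the tiles**3 tile products loads one tile of A and one of B.
    # Each of the tiles**2 tiles of C is loaded once and stored once.
    words_moved = tiles**3 * 2 * b * b + tiles**2 * 2 * b * b

    # Conventional algorithm: n**3 multiplications and n**3 additions.
    flops = 2 * n**3
    bound_scale = flops / math.sqrt(fast_memory_words)

    return {
        "tile": b,
        "words_moved": words_moved,
        "bound_scale": bound_scale,
        "ratio": words_moved / bound_scale,
    }


if __name__ == "__main__":
    print(blocked_matmul_traffic(n=1024, fast_memory_words=3 * 64 * 64))
```

Under these assumptions the traffic stays within a small constant factor of #operations / sqrt(M) as `n` grows, which is the sense in which tiled multiplication matches the order of the bound; the constant hidden in $\Omega$ is not modelled here.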

arXiv.org e-Print Archive

CiteSeerX

Crossref

Fast Sparse Matrix Multiplication

Author: A. Shpilka
D. Coppersmith
F. Eisenbrand
F.G. Gustavson
J. Cheriyan
J. Nešetřil
K. Mulmuley
M.O. Rabin
N. Alon
P. Bürgisser
R. Raz
R. Seidel
U. Zwick
V. Pan
V. Strassen
X. Huang
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2004

Let A and B be two n × n matrices over a ring R (e.g., the reals or the integers), each containing at most m non-zero elements. We present a new algorithm that multiplies A and B using O(m ) algebraic operations (i.e., multiplications, additions and subtractions) over R. The naive matrix multiplication algorithm, on the other hand, may need to perform Ω(mn) operations to accomplish the same task. For , the new algorithm performs an almost optimal number of only n operations. For m the new algorithm is also faster than the best known matrix multiplication algorithm for dense matrices which uses O(n ) algebraic operations. The new algorithm is obtained using a surprisingly straightforward combination of a simple combinatorial idea and existing fast rectangular matrix multiplication algorithms. We also obtain improved algorithms for the multiplication of more than two sparse matrices.
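For orientation, here is a minimal sketch of the naive sparse product that the abstract uses as its baseline, with a counter for scalar multiplications. It is not the new algorithm described above (which relies on fast rectangular matrix multiplication); the dictionary-of-coordinates representation and the name `sparse_matmul` are assumptions made for illustration.

```python
from collections import defaultdict


def sparse_matmul(a: dict, b: dict):
    """Naive sparse product C = A * B.

    a, b: dicts mapping (row, col) -> non-zero value.
    Returns (c, mults), where c uses the same representation and mults is
    the number of scalar multiplications performed.
    """
    # Group the non-zeros of B by row so that A[i, k] can be paired with
    # every non-zero B[k, j] directly.
    rows_of_b = defaultdict(list)
    for (k, j), value in b.items():
        rows_of_b[k].append((j, value))

    c = defaultdict(int)
    mults = 0
    for (i, k), a_value in a.items():
        # .get avoids creating empty entries for rows of B with no non-zeros.
        for j, b_value in rows_of_b.get(k, ()):
            c[(i, j)] += a_value * b_value
            mults += 1
    return dict(c), mults


if __name__ == "__main__":
    a = {(0, 0): 1, (0, 1): 2, (1, 1): 3}
    b = {(0, 0): 4, (1, 0): 5, (1, 1): 6}
    print(sparse_matmul(a, b))
```

Each of the at most m non-zeros of A is paired with at most n non-zeros in one row of B, so this loop never performs more than mn multiplications, and there are inputs on which it performs a number of that order.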

CiteSeerX

Crossref

Computing minimal interpolation bases

Author: Jeannerod Claude-Pierre
Neiger Vincent
Schost Eric
Villard Gilles
Publication venue: 'Elsevier BV'
Publication date: 13/06/2016

We consider the problem of computing univariate polynomial matrices over a field that represent minimal solution bases for a general interpolation problem, some forms of which are the vector M-Padé approximation problem in [Van Barel and Bultheel, Numerical Algorithms 3, 1992] and the rational interpolation problem in [Beckermann and Labahn, SIAM J. Matrix Anal. Appl. 22, 2000]. Particular instances of this problem include the bivariate interpolation steps of Guruswami-Sudan hard-decision and Kötter-Vardy soft-decision decodings of Reed-Solomon codes, the multivariate interpolation step of list-decoding of folded Reed-Solomon codes, and Hermite-Padé approximation. In the mentioned references, the problem is solved using iterative algorithms based on recurrence relations. Here, we discuss a fast, divide-and-conquer version of this recurrence, taking advantage of fast matrix computations over the scalars and over the polynomials. This new algorithm is deterministic, and for computing shifted minimal bases of relations between $m$ vectors of size $\sigma$ it uses $\tilde{O}(m^{\omega-1}(\sigma + |s|))$ field operations, where $\omega$ is the exponent of matrix multiplication, and $|s|$ is the sum of the entries of the input shift $s$, with $\min(s) = 0$. This complexity bound improves in particular on earlier algorithms in the case of bivariate interpolation for soft decoding, while matching fastest existing algorithms for simultaneous Hermite-Padé approximation.

HAL-ENS-LYON

INRIA a CCSD electronic archive server

Hal-Diderot

An elementary algorithm for computing the determinant of pentadiagonal Toeplitz matrices

Author: Cinkir Zubeyir
Publication venue: Elsevier B.V.
Publication date: 31/03/2012

Over the last 25 years, various fast algorithms for computing the determinant of a pentadiagonal Toeplitz matrix have been developed. In this paper, we give a new kind of elementary algorithm requiring $56 \cdot \lfloor (n-4)/k \rfloor + 30k + O(\log n)$ operations, where $k \geq 4$ is an integer that can be chosen freely at the beginning of the algorithm. For example, we can compute $\det(T_n)$ in $n + O(\log n)$ and $82\sqrt{n} + O(\log n)$ operations if we choose $k$ as $56$ and $\lfloor \sqrt{\tfrac{28}{15}(n-4)} \rfloor$, respectively. For various applications, it is enough to test whether the determinant of a pentadiagonal Toeplitz matrix is zero. As another result of this paper, we use modular arithmetic to give a fast algorithm determining when determinants of such matrices are non-zero. This second algorithm works only for Toeplitz matrices with rational entries.
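The trade-off in the parameter k can be checked with a few lines of arithmetic. The sketch below assumes the operation count reads as 56·⌊(n−4)/k⌋ + 30k (ignoring the O(log n) term) and that the second choice of k is ⌊√(28(n−4)/15)⌋, which is the value that balances the two terms; the function names are hypothetical, and the code evaluates the cost formula only, not the determinant algorithm itself.

```python
import math


def operation_count(n: int, k: int) -> int:
    """Leading terms of the assumed cost: 56 * floor((n - 4) / k) + 30 * k."""
    if k < 4:
        raise ValueError("k must be an integer >= 4")
    return 56 * ((n - 4) // k) + 30 * k


def balanced_k(n: int) -> int:
    """floor(sqrt(28 * (n - 4) / 15)), clamped to the allowed range k >= 4."""
    # isqrt of the floored quotient equals the floor of the real square root.
    return max(4, math.isqrt(28 * (n - 4) // 15))


if __name__ == "__main__":
    n = 10004
    print(operation_count(n, 56))             # k = 56: roughly n plus a constant
    print(balanced_k(n))                      # k of order sqrt(n)
    print(operation_count(n, balanced_k(n)))  # compare with 82 * sqrt(n)
    print(82 * math.sqrt(n))
```

With k = 56 the first term is about n − 4 and the second is the constant 1680; with k of order √n both terms are of order √n, and their sum is close to 2·√(56·30·(n−4)) ≈ 82√n.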

Elsevier - Publisher Connector