
    A fast solver for linear systems with displacement structure

    We describe a fast solver for linear systems with reconstructable Cauchy-like structure, which requires $O(rn^2)$ floating point operations and $O(rn)$ memory locations, where $n$ is the size of the matrix and $r$ its displacement rank. The solver is based on the application of the generalized Schur algorithm to a suitable augmented matrix, under some assumptions on the knots of the Cauchy-like matrix. It includes various pivoting strategies, already discussed in the literature, and a new algorithm, which only requires reconstructability. We have developed a software package, written in Matlab and C-MEX, which provides a robust implementation of the above method. Our package also includes solvers for Toeplitz(+Hankel)-like and Vandermonde-like linear systems, as these structures can be reduced to Cauchy-like by fast and stable transforms. Numerical experiments demonstrate the effectiveness of the software.
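    As a concrete illustration of the displacement structure such solvers exploit (a minimal sketch, not the package's implementation): a Cauchy-like matrix C with knots s, t and displacement rank r satisfies diag(s) C - C diag(t) = G H^T, so C is fully determined by the $O(rn)$ generator data. A NumPy sketch, with all names (G, H, s, t) our own:

        # Build a Cauchy-like matrix from its generators and verify the
        # displacement equation diag(s) C - C diag(t) = G H^T.
        import numpy as np

        n, r = 8, 2
        rng = np.random.default_rng(0)
        s = rng.standard_normal(n)             # row knots
        t = rng.standard_normal(n) + 10.0      # column knots, disjoint from s
        G = rng.standard_normal((n, r))        # generator matrices
        H = rng.standard_normal((n, r))

        # Entrywise reconstruction: C[i, j] = (G H^T)[i, j] / (s[i] - t[j])
        C = (G @ H.T) / (s[:, None] - t[None, :])

        disp = np.diag(s) @ C - C @ np.diag(t)
        assert np.allclose(disp, G @ H.T)
        print("displacement rank:", np.linalg.matrix_rank(disp))   # prints 2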

    Block pivoting implementation of a symmetric Toeplitz solver

    Toeplitz matrices are characterized by a special structure that can be exploited to obtain fast linear system solvers. These solvers are difficult to parallelize due to their low computational cost and their closely coupled data operations. We propose to transform the Toeplitz system matrix into a Cauchy-like matrix, since the latter can be divided into two independent matrices of half the size of the system matrix, and each of these smaller matrices can be factorized efficiently on multicore computers. We use OpenMP and store data in memory by blocks in consecutive positions, yielding a simple and efficient algorithm. In addition, by exploiting the fact that diagonal pivoting does not destroy the special structure of Cauchy-like matrices, we introduce a local diagonal pivoting technique which improves the accuracy of the solution and the stability of the algorithm.

    This work was partially supported by the Spanish Ministerio de Ciencia e Innovacion (Projects TIN2008-06570-C04-02 and TEC2009-13741), Vicerrectorado de Investigacion de la Universidad Politecnica de Valencia through PAID-05-10 (ref. 2705), and Generalitat Valenciana through project PROMETEO/2009/2013.

    Alonso-Jordá, P.; Dolz Zaragozá, MF.; Vidal Maciá, AM. (2014). Block pivoting implementation of a symmetric Toeplitz solver. Journal of Parallel and Distributed Computing 74(5):2392-2399. https://doi.org/10.1016/j.jpdc.2014.02.003
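    The half-size splitting the authors exploit can be seen directly: conjugating a symmetric Toeplitz matrix by the discrete sine transform (orthogonal and symmetric) yields a Cauchy-like matrix whose entries vanish whenever the row and column indices have different parity, so the transformed system decouples into two independent half-size systems. A minimal, dense, sequential NumPy sketch of this reduction (not the paper's blocked OpenMP implementation, and without its pivoting):

        # Symmetric Toeplitz system solved via the DST-I reduction to a
        # Cauchy-like matrix that splits into two half-size subsystems.
        import numpy as np
        from scipy.linalg import toeplitz

        n = 8
        rng = np.random.default_rng(1)
        c = rng.standard_normal(n)
        c[0] = n                              # keep T comfortably nonsingular
        T = toeplitz(c)                       # symmetric Toeplitz
        b = rng.standard_normal(n)

        # DST-I matrix: symmetric and orthogonal, so S^{-1} = S
        k = np.arange(1, n + 1)
        S = np.sqrt(2.0 / (n + 1)) * np.sin(np.pi * np.outer(k, k) / (n + 1))

        C = S @ T @ S                         # Cauchy-like
        odd = (k[:, None] + k[None, :]) % 2 == 1
        assert np.allclose(C[odd], 0.0)       # checkerboard zero pattern

        # T x = b  <=>  C y = S b with x = S y; solve the two parities apart
        y, Sb = np.zeros(n), S @ b
        for idx in (np.arange(0, n, 2), np.arange(1, n, 2)):
            y[idx] = np.linalg.solve(C[np.ix_(idx, idx)], Sb[idx])
        x = S @ y
        assert np.allclose(T @ x, b)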

    Fast and accurate con-eigenvalue algorithm for optimal rational approximations

    The need to compute small con-eigenvalues and the associated con-eigenvectors of positive-definite Cauchy matrices naturally arises when constructing rational approximations with a (near) optimally small $L^{\infty}$ error. Specifically, given a rational function with $n$ poles in the unit disk, a rational approximation with $m \ll n$ poles in the unit disk may be obtained from the $m$th con-eigenvector of an $n \times n$ Cauchy matrix, where the associated con-eigenvalue $\lambda_{m} > 0$ gives the approximation error in the $L^{\infty}$ norm. Unfortunately, standard algorithms do not accurately compute small con-eigenvalues (and the associated con-eigenvectors) and, in particular, yield few or no correct digits for con-eigenvalues smaller than the machine roundoff. We develop a fast and accurate algorithm for computing con-eigenvalues and con-eigenvectors of positive-definite Cauchy matrices, yielding even the tiniest con-eigenvalues with high relative accuracy. The algorithm computes the $m$th con-eigenvalue in $\mathcal{O}(m^{2}n)$ operations and, since the con-eigenvalues of positive-definite Cauchy matrices decay exponentially fast, we obtain (near) optimal rational approximations in $\mathcal{O}(n(\log\delta^{-1})^{2})$ operations, where $\delta$ is the approximation error in the $L^{\infty}$ norm. We derive error bounds demonstrating high relative accuracy of the computed con-eigenvalues and the high accuracy of the unit con-eigenvectors. We also provide examples of using the algorithm to compute (near) optimal rational approximations of functions with singularities and sharp transitions, where approximation errors close to machine precision are obtained. Finally, we present numerical tests on random (complex-valued) Cauchy matrices to show that the algorithm computes all the con-eigenvalues and con-eigenvectors with nearly full precision.
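    To see why high relative accuracy is the crux here, note that for a real symmetric positive-definite Cauchy matrix the con-eigenvalues coincide with the ordinary eigenvalues, and these decay exponentially fast. The sketch below uses a standard double-precision eigensolver (deliberately not the paper's algorithm, and with nodes of our choosing) to show the decay; eigenvalues below roughly machine epsilon times the matrix norm come back with no correct digits, which is exactly the failure mode the paper addresses:

        # Eigenvalues of the positive-definite Cauchy matrix C[i,j] = 1/(x_i + x_j)
        # decay exponentially; a conventional solver loses all relative accuracy
        # once lambda_m drops below eps * ||C||.
        import numpy as np

        n = 30
        x = 0.5 + np.arange(n)                    # nodes chosen for illustration
        C = 1.0 / (x[:, None] + x[None, :])       # symmetric positive definite
        lam = np.linalg.eigvalsh(C)[::-1]         # eigenvalues, descending
        for m in (0, 5, 10, 15, 20):
            # printed values below ~1e-17 are pure roundoff in double precision
            print(f"lambda_{m} = {lam[m]:.3e}")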

    Minimizing Communication in Linear Algebra

    In 1981 Hong and Kung proved a lower bound on the amount of communication needed to perform dense matrix multiplication using the conventional $O(n^3)$ algorithm, where the input matrices are too large to fit in the small, fast memory. In 2004 Irony, Toledo and Tiskin gave a new proof of this result and extended it to the parallel case. In both cases the lower bound may be expressed as $\Omega$(#arithmetic operations / $\sqrt{M}$), where $M$ is the size of the fast memory (or local memory in the parallel case). Here we generalize these results to a much wider variety of algorithms, including LU factorization, Cholesky factorization, $LDL^T$ factorization, QR factorization, and algorithms for eigenvalues and singular values, i.e., essentially all direct methods of linear algebra. The proof works for dense or sparse matrices, and for sequential or parallel algorithms. In addition to lower bounds on the amount of data moved (bandwidth) we get lower bounds on the number of messages required to move it (latency). We illustrate how to extend our lower bound technique to compositions of linear algebra operations (like computing powers of a matrix), to decide whether it is enough to call a sequence of simpler optimal algorithms (like matrix multiplication) to minimize communication, or whether we can do better. We give examples of both. We also show how to extend our lower bounds to certain graph theoretic problems. We point out recently designed algorithms for dense LU, Cholesky, QR, eigenvalue and SVD problems that attain these lower bounds; implementations of LU and QR show large speedups over conventional linear algebra algorithms in standard libraries like LAPACK and ScaLAPACK. Many open problems remain.
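    The matching upper bound for matrix multiplication comes from blocking: with block size b chosen so three b-by-b tiles fit in fast memory (b close to sqrt(M/3)), the tiled algorithm moves $O(n^3/\sqrt{M})$ words. A sketch with an explicit traffic counter (the counter and the parameters are ours, for illustration; assumes b divides n):

        # Tiled matrix multiplication, counting words moved between slow and
        # fast memory: 2*n^3/b + n^2 words, i.e. O(n^3 / sqrt(M)) for b ~ sqrt(M/3).
        import numpy as np

        def tiled_matmul(A, B, b):
            n = A.shape[0]
            C = np.zeros((n, n))
            words_moved = 0
            for i in range(0, n, b):
                for j in range(0, n, b):
                    Cij = np.zeros((b, b))            # accumulator stays in fast memory
                    for k in range(0, n, b):
                        words_moved += 2 * b * b      # load a tile of A and a tile of B
                        Cij += A[i:i+b, k:k+b] @ B[k:k+b, j:j+b]
                    C[i:i+b, j:j+b] = Cij
                    words_moved += b * b              # write the C tile back
            return C, words_moved

        n, M = 256, 3 * 32 * 32                       # fast memory: three 32x32 tiles
        b = int(np.sqrt(M / 3))
        rng = np.random.default_rng(2)
        A, B = rng.standard_normal((n, n)), rng.standard_normal((n, n))
        C, w = tiled_matmul(A, B, b)
        assert np.allclose(C, A @ B)
        print(f"words moved: {w} (= 2n^3/b + n^2 = {2*n**3//b + n**2})")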

    Fast linear algebra is stable

    In an earlier paper, we showed that a large class of fast recursive matrix multiplication algorithms is stable in a normwise sense, and that in fact if multiplication of $n$-by-$n$ matrices can be done by any algorithm in $O(n^{\omega + \eta})$ operations for any $\eta > 0$, then it can be done stably in $O(n^{\omega + \eta})$ operations for any $\eta > 0$. Here we extend this result to show that essentially all standard linear algebra operations, including LU decomposition, QR decomposition, linear equation solving, matrix inversion, solving least squares problems, (generalized) eigenvalue problems and the singular value decomposition, can also be done stably (in a normwise sense) in $O(n^{\omega + \eta})$ operations.
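    The prototypical member of this class is Strassen's algorithm, which replaces the eight half-size products of the classical recursion with seven and so runs in $O(n^{\log_2 7}) \approx O(n^{2.81})$ operations while satisfying a normwise error bound. A minimal sketch for sizes that are a power of two (the cutoff and seeds are arbitrary choices of ours):

        # Strassen's recursive matrix multiplication: 7 half-size products
        # per level instead of 8, giving O(n^{log2 7}) arithmetic.
        import numpy as np

        def strassen(A, B, cutoff=64):
            n = A.shape[0]                     # assumes square, n a power of two
            if n <= cutoff:
                return A @ B                   # fall back to classical multiply
            h = n // 2
            A11, A12, A21, A22 = A[:h, :h], A[:h, h:], A[h:, :h], A[h:, h:]
            B11, B12, B21, B22 = B[:h, :h], B[:h, h:], B[h:, :h], B[h:, h:]
            M1 = strassen(A11 + A22, B11 + B22, cutoff)
            M2 = strassen(A21 + A22, B11, cutoff)
            M3 = strassen(A11, B12 - B22, cutoff)
            M4 = strassen(A22, B21 - B11, cutoff)
            M5 = strassen(A11 + A12, B22, cutoff)
            M6 = strassen(A21 - A11, B11 + B12, cutoff)
            M7 = strassen(A12 - A22, B21 + B22, cutoff)
            C = np.empty((n, n))
            C[:h, :h] = M1 + M4 - M5 + M7
            C[:h, h:] = M3 + M5
            C[h:, :h] = M2 + M4
            C[h:, h:] = M1 - M2 + M3 + M6
            return C

        rng = np.random.default_rng(3)
        A, B = rng.standard_normal((128, 128)), rng.standard_normal((128, 128))
        print("normwise error:", np.linalg.norm(strassen(A, B) - A @ B))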

    Row Compression and Nested Product Decomposition of a Hierarchical Representation of a Quasiseparable Matrix

    This research introduces a row compression and nested product decomposition of an $n \times n$ hierarchical representation of a rank-structured matrix A, which extends the compression and nested product decomposition of a quasiseparable matrix. The hierarchical parameter extraction algorithm of a quasiseparable matrix is efficient, requiring only $O(n\log n)$ operations, and is proven backward stable. The row compression comprises a sequence of small Householder transformations that are formed from the low-rank, lower triangular, off-diagonal blocks of the hierarchical representation. The row compression forms a factorization of matrix A, where A = QC, Q is the product of the Householder transformations, and C preserves the low-rank structure in both the lower and upper triangular parts of matrix A. The nested product decomposition is accomplished by applying a sequence of orthogonal transformations to the low-rank, upper triangular, off-diagonal blocks of the compressed matrix C. Both the compression and decomposition algorithms are stable and require $O(n\log n)$ operations. At this point, the matrix-vector product and solver algorithms are the only ones fully proven to be backward stable for quasiseparable matrices. By combining the fast matrix-vector product and system solver, linear systems involving the hierarchical representation, via its nested product decomposition, are solved directly with linear complexity and unconditional stability. Applications in image deblurring and compression, which capitalize on the concepts from the row compression and nested product decomposition algorithms, will be shown.
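    The flavor of the linear-complexity kernels involved can be conveyed on the simplest case: an order-one quasiseparable matrix, given by its generators, admits an $O(n)$ matrix-vector product via one forward and one backward recurrence. A sketch under these assumptions (scalar generators, with names of our choosing; this is the classical fast matvec, not the row-compression algorithm itself):

        # O(n) matrix-vector product for an order-1 quasiseparable matrix with
        # generators d, p, q, a (lower part) and g, h, w (upper part):
        #   A[i, i] = d[i]
        #   A[i, j] = p[i] * a[i-1] * ... * a[j+1] * q[j]   for i > j
        #   A[i, j] = g[i] * w[i+1] * ... * w[j-1] * h[j]   for i < j
        import numpy as np

        def qs_matvec(d, p, q, a, g, h, w, x):
            n = len(x)
            y = d * x                      # diagonal part
            f = 0.0                        # forward sweep accumulates the lower part
            for i in range(1, n):
                f = a[i - 1] * f + q[i - 1] * x[i - 1]
                y[i] += p[i] * f
            r = 0.0                        # backward sweep accumulates the upper part
            for i in range(n - 2, -1, -1):
                r = w[i + 1] * r + h[i + 1] * x[i + 1]
                y[i] += g[i] * r
            return y

        # Check against the dense matrix assembled from the same generators
        n = 6
        rng = np.random.default_rng(4)
        d, p, q, a, g, h, w = (rng.standard_normal(n) for _ in range(7))
        A = np.diag(d)
        for i in range(n):
            for j in range(n):
                if i > j:
                    A[i, j] = p[i] * np.prod(a[j + 1:i]) * q[j]
                elif i < j:
                    A[i, j] = g[i] * np.prod(w[i + 1:j]) * h[j]
        x = rng.standard_normal(n)
        assert np.allclose(qs_matvec(d, p, q, a, g, h, w, x), A @ x)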