129 research outputs found
Differential qd algorithm with shifts for rank-structured matrices
Although QR iterations dominate eigenvalue computations, there are several
important cases in which alternative LR-type algorithms may be preferable. One
is the symmetric tridiagonal case, where the differential qd algorithm
with shifts (dqds) proposed by Fernando and Parlett often converges faster
while preserving high relative accuracy (which is not guaranteed by the
QR algorithm). In eigenvalue computations for rank-structured matrices, the QR
algorithm is also a popular choice since, in the symmetric case, the rank
structure is preserved. In the unsymmetric case, however, the QR algorithm destroys
the rank structure and, hence, LR-type algorithms come into play once again. In
the current paper we discover several variants of qd algorithms for
quasiseparable matrices. Remarkably, one of them, when applied to Hessenberg
matrices, becomes a direct generalization of the dqds algorithm for tridiagonal
matrices. Therefore, it can be applied to such important matrices as companion
and confederate matrices, and it provides an alternative algorithm for finding
the roots of a polynomial represented in a basis of orthogonal polynomials.
Results of preliminary numerical experiments are presented.
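For orientation, the classical tridiagonal dqds step that the abstract refers to can be sketched as follows. This is a minimal illustration of the standard Fernando and Parlett recurrence, not the quasiseparable variant developed in the paper. It assumes the tridiagonal matrix is given in factored form T = LU, where U is upper bidiagonal with diagonal `q` and unit superdiagonal, and L is unit lower bidiagonal with subdiagonal `e`. The function name `dqds_step` and the sample data are hypothetical.

```python
def dqds_step(q, e, shift=0.0):
    """One dqds step: factors of L*U are mapped to factors of U*L - shift*I.

    q: diagonal of U (length n), e: subdiagonal of L (length n - 1).
    No safeguards are included: a pivot q_new[k] equal to zero would
    raise ZeroDivisionError, and the shift is not chosen automatically.
    """
    n = len(q)
    q_new = [0.0] * n
    e_new = [0.0] * (n - 1)
    d = q[0] - shift
    for k in range(n - 1):
        q_new[k] = d + e[k]
        t = q[k + 1] / q_new[k]
        e_new[k] = e[k] * t
        d = d * t - shift
    q_new[n - 1] = d
    return q_new, e_new


# Zero-shift iteration on a 2 x 2 example: T = L*U = [[4, 1], [4, 2]].
q, e = [4.0, 1.0], [1.0]
for _ in range(60):
    q, e = dqds_step(q, e)
# e has decayed toward zero and q holds approximations to the eigenvalues of T.
```

With zero shift the entries of `e` decay geometrically, so `q` approaches the eigenvalues. In practice a shift strategy is what makes the iteration fast, and that part is deliberately left out of this sketch.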
A fast semi-direct least squares algorithm for hierarchically block separable matrices
We present a fast algorithm for linear least squares problems governed by
hierarchically block separable (HBS) matrices. Such matrices are generally
dense but data-sparse and can describe many important operators including those
derived from asymptotically smooth radial kernels that are not too oscillatory.
The algorithm is based on a recursive skeletonization procedure that exposes
this sparsity and solves the dense least squares problem as a larger,
equality-constrained, sparse one. It relies on a sparse QR factorization
coupled with iterative weighted least squares methods. In essence, our scheme
consists of a direct component, comprised of matrix compression and
factorization, followed by an iterative component to enforce certain equality
constraints. At most two iterations are typically required for problems that
are not too ill-conditioned. For an HBS matrix
having bounded off-diagonal block rank, the algorithm has optimal complexity.
If the rank increases with the spatial dimension, as is
common for operators that are singular at the origin, then the complexity
depends on the dimension (1D, 2D, or
3D). We illustrate the performance of the method on
both over- and underdetermined systems in a variety of settings, with an
emphasis on radial basis function approximation and efficient updating and
downdating.
Comment: 24 pages, 8 figures, 6 tables; to appear in SIAM J. Matrix Anal. Appl.
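The idea of enforcing equality constraints through weighted least squares can be illustrated on a toy problem. The sketch below uses the plain method of weighting: the constraint row is appended to the system with a large weight and an ordinary least squares problem is solved. It is not the scheme from the paper, which couples a sparse QR factorization with an iterative refinement. The normal equations are used here only to keep the example short, the problem is restricted to two unknowns, and the name `weighted_constrained_lsq` and all data are hypothetical.

```python
def weighted_constrained_lsq(A, b, c, d, weight):
    """Approximate argmin ||A x - b|| subject to c . x = d for x in R^2.

    The constraint is appended as an extra row scaled by `weight`, and the
    2 x 2 normal equations of the stacked system are solved by Cramer's rule.
    """
    rows = A + [[weight * ci for ci in c]]
    rhs = b + [weight * d]
    m00 = sum(r[0] * r[0] for r in rows)
    m01 = sum(r[0] * r[1] for r in rows)
    m11 = sum(r[1] * r[1] for r in rows)
    g0 = sum(r[0] * y for r, y in zip(rows, rhs))
    g1 = sum(r[1] * y for r, y in zip(rows, rhs))
    det = m00 * m11 - m01 * m01
    return [(g0 * m11 - m01 * g1) / det, (m00 * g1 - m01 * g0) / det]


# Overdetermined system with the constraint x0 + x1 = 3.
A = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
b = [1.0, 2.0, 4.0]
x = weighted_constrained_lsq(A, b, c=[1.0, 1.0], d=3.0, weight=1e3)
constraint_residual = x[0] + x[1] - 3.0
```

The constraint is satisfied only approximately, with an error that shrinks as the weight grows. Very large weights make the stacked system ill-conditioned, which is one reason an iterative correction is attractive.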
Multilevel quasiseparable matrices in PDE-constrained optimization
Optimization problems with constraints in the form of a partial differential
equation arise frequently in the process of engineering design. The
discretization of PDE-constrained optimization problems results in large-scale
linear systems of saddle-point type. In this paper we propose and develop a
novel approach to solving such systems by exploiting so-called quasiseparable
matrices. One may think of a usual quasiseparable matrix as a discrete
analog of the Green's function of a one-dimensional differential operator. A nice
feature of such matrices is that almost every algorithm that employs them has
linear complexity. We extend the application of quasiseparable matrices to
problems in higher dimensions. Namely, we construct a class of preconditioners
which can be computed and applied at a linear computational cost. Their use
with appropriate Krylov methods leads to algorithms of nearly linear
complexity.
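The Green's function analogy and the linear complexity claim can be made concrete with a small sketch. It assumes the simplest generator form, in which the strictly lower part of the matrix is `u[i] * v[j]` and the strictly upper part is `p[i] * q[j]`. This is a special case of quasiseparable structure, and the function names and the grid are hypothetical. The example matrix samples the Green's function of -u'' on (0, 1) with zero boundary values, G(x, y) = x(1 - y) for x <= y and y(1 - x) otherwise.

```python
def semiseparable_matvec(d, u, v, p, q, x):
    """Compute y = A x in O(n) operations, where
    A[i][i] = d[i], A[i][j] = u[i] * v[j] for i > j, A[i][j] = p[i] * q[j] for i < j.
    """
    n = len(x)
    y = [d[i] * x[i] for i in range(n)]
    acc = 0.0
    for i in range(n):              # running sum of v[j] * x[j] over j < i
        y[i] += u[i] * acc
        acc += v[i] * x[i]
    acc = 0.0
    for i in reversed(range(n)):    # running sum of q[j] * x[j] over j > i
        y[i] += p[i] * acc
        acc += q[i] * x[i]
    return y


def dense_matvec(d, u, v, p, q, x):
    """Reference O(n^2) product, used only to check the fast version."""
    n = len(x)
    y = []
    for i in range(n):
        row = [d[i] if i == j else (u[i] * v[j] if i > j else p[i] * q[j])
               for j in range(n)]
        y.append(sum(a * b for a, b in zip(row, x)))
    return y


# Generators of the sampled Green's function on a uniform interior grid.
n = 6
t = [(i + 1) / (n + 1) for i in range(n)]
d = [ti * (1 - ti) for ti in t]
u, v = [1 - ti for ti in t], t      # lower part: t_j * (1 - t_i)
p, q = t, [1 - ti for ti in t]      # upper part: t_i * (1 - t_j)
x = [1.0, -2.0, 0.5, 3.0, 0.0, 1.5]
y_fast = semiseparable_matvec(d, u, v, p, q, x)
y_ref = dense_matvec(d, u, v, p, q, x)
```

The two running sums are what replace the dense row-by-row products, which is the source of the linear cost.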
Row Compression and Nested Product Decomposition of a Hierarchical Representation of a Quasiseparable Matrix
This research introduces a row compression and nested product decomposition of an n × n hierarchical representation of a rank-structured matrix A, which extends the compression and nested product decomposition of a quasiseparable matrix. The hierarchical parameter extraction algorithm for a quasiseparable matrix is efficient, requiring only O(n log(n)) operations, and is proven backward stable. The row compression consists of a sequence of small Householder transformations that are formed from the low-rank, lower triangular, off-diagonal blocks of the hierarchical representation. The row compression yields a factorization A = QC, where Q is the product of the Householder transformations and C preserves the low-rank structure in both the lower and upper triangular parts of A. The nested product decomposition is accomplished by applying a sequence of orthogonal transformations to the low-rank, upper triangular, off-diagonal blocks of the compressed matrix C. Both the compression and decomposition algorithms are stable and require O(n log(n)) operations. At this point, the matrix-vector product and solver algorithms are the only ones fully proven to be backward stable for quasiseparable matrices. By combining the fast matrix-vector product and system solver, linear systems involving the hierarchical representation to nested product decomposition are solved directly with linear complexity and unconditional stability. Applications in image deblurring and compression that capitalize on the concepts from the row compression and nested product decomposition algorithms will be shown.
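To illustrate what a Householder row compression does to a low-rank block, the sketch below treats the simplest case: a rank-one block B = u v^T. One reflection that maps u to a multiple of the first unit vector leaves a single nonzero row in the block. This is only an illustration of the building block under that rank-one assumption, not the hierarchical algorithm of the abstract, and the function names and data are hypothetical.

```python
import math


def householder_vector(u):
    """Return a unit vector w such that (I - 2 w w^T) u is a multiple of e_1."""
    norm = math.sqrt(sum(x * x for x in u))
    alpha = -math.copysign(norm, u[0])   # sign chosen to avoid cancellation
    w = list(u)
    w[0] -= alpha
    w_norm = math.sqrt(sum(x * x for x in w))
    return [x / w_norm for x in w]


def apply_reflection(w, B):
    """Return (I - 2 w w^T) B for a matrix B stored as a list of rows."""
    n_rows, n_cols = len(B), len(B[0])
    wt_b = [sum(w[i] * B[i][j] for i in range(n_rows)) for j in range(n_cols)]
    return [[B[i][j] - 2.0 * w[i] * wt_b[j] for j in range(n_cols)]
            for i in range(n_rows)]


# Rank-one block B = u v^T and its row-compressed form.
u = [3.0, 4.0, 0.0, 12.0]
v = [1.0, -2.0, 0.5]
B = [[ui * vj for vj in v] for ui in u]
w = householder_vector(u)
C = apply_reflection(w, B)   # all rows of C except the first are numerically zero
```

For a block of rank r, repeating the step on r columns of a basis for the column space would leave r nonzero rows. The reflection is orthogonal, so no information is lost.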
An efficient multi-core implementation of a novel HSS-structured multifrontal solver using randomized sampling
We present a sparse linear system solver that is based on a multifrontal
variant of Gaussian elimination, and exploits low-rank approximation of the
resulting dense frontal matrices. We use hierarchically semiseparable (HSS)
matrices, which have low-rank off-diagonal blocks, to approximate the frontal
matrices. For HSS matrix construction, a randomized sampling algorithm is used
together with interpolative decompositions. The combination of the randomized
compression with a fast ULV HSS factorization leads to a solver with lower
computational complexity than the standard multifrontal method for many
applications, resulting in speedups of up to 7-fold for problems in our test
suite. The implementation targets many-core systems by using task parallelism
with dynamic runtime scheduling. Numerical experiments show performance
improvements over state-of-the-art sparse direct solvers. The implementation
achieves high performance and good scalability on a range of modern shared
memory parallel systems, including the Intel Xeon Phi (MIC). The code is part
of a software package called STRUMPACK -- STRUctured Matrices PACKage, which
also has a distributed memory component for dense rank-structured matrices.
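The randomized sampling idea behind the compression step can be sketched in a few lines. The example below is a generic randomized range finder: a low-rank matrix is multiplied by a few Gaussian random vectors, and the samples are orthonormalized to obtain a basis for its column space. It does not reproduce the HSS construction or the interpolative decomposition used in the solver, and it does not use the STRUMPACK API. All function names, the test matrix, and the tolerance are hypothetical choices.

```python
import random


def matmul(A, B):
    """Dense product of two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def transpose(A):
    return [list(col) for col in zip(*A)]


def orthonormalize(vectors, tol=1e-8):
    """Modified Gram-Schmidt; vectors that are numerically dependent are dropped."""
    basis = []
    for vec in vectors:
        r = list(vec)
        scale = sum(x * x for x in r) ** 0.5
        for b in basis:
            coeff = sum(x * y for x, y in zip(b, r))
            r = [x - coeff * y for x, y in zip(r, b)]
        norm = sum(x * x for x in r) ** 0.5
        if norm > tol * max(scale, 1.0):
            basis.append([x / norm for x in r])
    return basis


def randomized_range(A, n_samples, seed=0):
    """Orthonormal basis (list of vectors) for the range of A from random samples."""
    rng = random.Random(seed)
    omega = [[rng.gauss(0.0, 1.0) for _ in range(n_samples)] for _ in range(len(A[0]))]
    samples = matmul(A, omega)               # each column is A times a random vector
    return orthonormalize(transpose(samples))


# An 8 x 6 matrix of exact rank 2, sampled with 4 random vectors.
A = [[(1.0 + i) * 1.0 + ((-1.0) ** i) * j for j in range(6)] for i in range(8)]
Q = randomized_range(A, n_samples=4)
approx = matmul(transpose(Q), matmul(Q, A))  # projection of A onto span(Q)
error = max(abs(a - b) for ra, rb in zip(A, approx) for a, b in zip(ra, rb))
```

Sampling a few more vectors than the expected rank is the usual safeguard. Here the surplus samples are numerically dependent and are discarded during orthonormalization.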