Fast Algorithms for Displacement and Low-Rank Structured Matrices
This tutorial provides an introduction to the development of fast matrix
algorithms based on the notions of displacement and various low-rank
structures.
A fast semi-direct least squares algorithm for hierarchically block separable matrices
We present a fast algorithm for linear least squares problems governed by
hierarchically block separable (HBS) matrices. Such matrices are generally
dense but data-sparse and can describe many important operators including those
derived from asymptotically smooth radial kernels that are not too oscillatory.
The algorithm is based on a recursive skeletonization procedure that exposes
this sparsity and solves the dense least squares problem as a larger,
equality-constrained, sparse one. It relies on a sparse QR factorization
coupled with iterative weighted least squares methods. In essence, our scheme
consists of a direct component, comprised of matrix compression and
factorization, followed by an iterative component to enforce certain equality
constraints. At most two iterations are typically required for problems that
are not too ill-conditioned. For an N × N HBS matrix
having bounded off-diagonal block rank, the algorithm has optimal O(N) complexity. If the rank increases with the spatial dimension, as is
common for operators that are singular at the origin, then this becomes
O(N) in 1D, O(N^{3/2}) in 2D, and
O(N^2) in 3D. We illustrate the performance of the method on
both over- and underdetermined systems in a variety of settings, with an
emphasis on radial basis function approximation and efficient updating and
downdating.
Comment: 24 pages, 8 figures, 6 tables; to appear in SIAM J. Matrix Anal. Appl.
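The direct-plus-iterative structure described above, a factorization step followed by a few refinement sweeps to enforce the equality constraints, can be illustrated on a dense toy problem. The sketch below uses the classical method of weighting with iterative refinement; the function name, the weight `tau`, and the dense `lstsq` solver are illustrative assumptions, not the paper's HBS-aware sparse QR implementation:

```python
import numpy as np

def weighted_eq_lsq(C, d, B, e, tau=1e6, iters=3):
    """Solve min ||C x - d|| subject to B x = e by the method of
    weighting, with a few iterative-refinement sweeps that drive
    down the remaining constraint violation."""
    # Stack the heavily weighted constraint rows on top of the
    # objective rows and solve one ordinary least squares problem.
    M = np.vstack([tau * B, C])
    x = np.zeros(C.shape[1])
    r_e, r_d = e.copy(), d.copy()
    for _ in range(iters):
        rhs = np.concatenate([tau * r_e, r_d])
        dx, *_ = np.linalg.lstsq(M, rhs, rcond=None)
        x = x + dx
        r_e = e - B @ x  # remaining equality-constraint residual
        r_d = d - C @ x  # remaining objective residual
    return x
```

In the paper's setting the analogous refinement typically needs at most two sweeps; here the iteration count is simply a fixed parameter.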
Computing the Nearest Doubly Stochastic Matrix with A Prescribed Entry
In this paper a nearest doubly stochastic matrix problem is studied. This problem is to find the
closest doubly stochastic matrix with the prescribed (1,1) entry to a given matrix. According to the
well-established dual theory in optimization, the dual of the underlying problem is an unconstrained
differentiable but not twice differentiable convex optimization problem. A Newton-type method is used
for solving the associated dual problem, and then the desired nearest doubly stochastic matrix is obtained.
Under some mild assumptions, the quadratic convergence of the proposed Newton's method is proved.
The numerical performance of the method is also demonstrated by numerical examples.
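The feasible set here is the intersection of simple convex sets (fixed row sums, fixed column sums, nonnegativity, and the prescribed entry), so the projection can also be computed by Dykstra's alternating projections. The sketch below uses that method purely as an illustrative alternative to the paper's dual Newton approach; the function name and iteration budget are assumptions:

```python
import numpy as np

def nearest_ds_with_entry(A, c, iters=2000):
    """Nearest doubly stochastic matrix to A with X[0,0] fixed at c,
    via Dykstra's alternating projections (illustrative alternative
    to the dual Newton method described in the paper)."""
    n = A.shape[0]

    def fix_entry(X):
        Y = X.copy()
        Y[0, 0] = c  # prescribed (1,1) entry, enforced last so it holds exactly
        return Y

    projections = [
        lambda X: X - (X.sum(axis=1, keepdims=True) - 1.0) / n,  # row sums = 1
        lambda X: X - (X.sum(axis=0, keepdims=True) - 1.0) / n,  # col sums = 1
        lambda X: np.maximum(X, 0.0),                            # X >= 0
        fix_entry,
    ]
    X = A.astype(float).copy()
    # Dykstra keeps one correction term per constraint set.
    increments = [np.zeros_like(X) for _ in projections]
    for _ in range(iters):
        for k, P in enumerate(projections):
            Y = P(X + increments[k])
            increments[k] = X + increments[k] - Y
            X = Y
    return X
```

Dykstra's method converges to the true Frobenius-norm projection when the intersection is nonempty, but only linearly; the dual Newton method studied in the paper is the quadratically convergent option.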
Row Compression and Nested Product Decomposition of a Hierarchical Representation of a Quasiseparable Matrix
This research introduces a row compression and nested product decomposition of an n × n hierarchical representation of a rank-structured matrix A, which extends the compression and nested product decomposition of a quasiseparable matrix. The hierarchical parameter extraction algorithm of a quasiseparable matrix is efficient, requiring only O(n log n) operations, and is proven backward stable. The row compression comprises a sequence of small Householder transformations formed from the low-rank, lower triangular, off-diagonal blocks of the hierarchical representation. The row compression yields a factorization A = QC, where Q is the product of the Householder transformations and C preserves the low-rank structure in both the lower and upper triangular parts of matrix A. The nested product decomposition is accomplished by applying a sequence of orthogonal transformations to the low-rank, upper triangular, off-diagonal blocks of the compressed matrix C. Both the compression and decomposition algorithms are stable and require O(n log n) operations. At present, the matrix-vector product and solver algorithms are the only ones fully proven to be backward stable for quasiseparable matrices. By combining the fast matrix-vector product and system solver, linear systems involving the hierarchical representation are solved directly, through the nested product decomposition, with linear complexity and unconditional stability. Applications in image deblurring and compression that capitalize on the concepts from the row compression and nested product decomposition algorithms are also presented.
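The rank structure exploited above can be seen in miniature for a rank-1 quasiseparable matrix, whose matrix-vector product collapses to prefix and suffix sums and so costs O(n). This is a simplified sketch of the general idea, not the paper's hierarchical representation or its O(n log n) extraction algorithm:

```python
import numpy as np

def qs_matvec(d, u, v, p, q, x):
    """O(n) matrix-vector product with the rank-1 quasiseparable matrix
    A[i, j] = u[i] * v[j] for i > j (lower part),
              d[i]        for i == j (diagonal),
              p[i] * q[j] for i < j (upper part)."""
    vx = v * x
    qx = q * x
    # prefix[i] = sum_{j < i} v[j] * x[j]
    prefix = np.concatenate([[0.0], np.cumsum(vx)[:-1]])
    # suffix[i] = sum_{j > i} q[j] * x[j]
    suffix = np.concatenate([np.cumsum(qx[::-1])[::-1][1:], [0.0]])
    return d * x + u * prefix + p * suffix
```

The same prefix/suffix recurrence, applied blockwise with low-rank generators instead of scalars, is what makes fast quasiseparable matvecs and solvers possible.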
Error Control and Efficient Memory Management for Sparse Integral Equation Solvers Based on Local-Global Solution Modes
This dissertation presents and analyzes two new algorithms for sparse direct solution methods based on the use of local-global solution (LOGOS) modes. One of the new algorithms is a rigorous error control strategy for LOGOS-based matrix factorizations that utilize overlapped, localizing modes (OL-LOGOS) on a shifted grid. The use of OL-LOGOS modes is critical to obtaining asymptotically efficient factorizations from LOGOS-based methods. Unfortunately, the approach also introduces a non-orthogonal basis function structure. This can cause errors to accumulate across levels of a multilevel implementation, which has previously posed a barrier to rigorous error control for the OL-LOGOS factorization method. This limitation is overcome, and it is shown that it is possible to efficiently decouple the fundamentally non-orthogonal factorization subspaces in a manner that prevents multilevel error propagation. This renders the OL-LOGOS factorization error controllable in a relative RMS sense. The impact of the new, error-controlled OL-LOGOS factorization algorithm on computational resource utilization is discussed and several numerical examples are presented to illustrate the performance of the improved algorithm relative to previously reported results.
The second algorithmic development is an efficient out-of-core (OOC) version of the OL-LOGOS factorization algorithm that allows the associated software tools to take advantage of additional resources for memory management. The proposed OOC algorithm incorporates a memory page definition tailored to match the flow of the OL-LOGOS factorization procedure. The efficiency of the paging scheme is evaluated quantitatively, because the performance of the mass storage devices tested does not follow analytical models. The latency and memory usage of the resulting OOC tools are compared with in-core performance results.
Both the new error control algorithm and the OOC method have been incorporated into previously existing software tools, and the dissertation presents results for real-world simulation problems.