27 research outputs found

    Householder QR Factorization With Randomization for Column Pivoting (HQRRP)

    Get PDF
    A fundamental problem when adding column pivoting to the Householder QR fac- torization is that only about half of the computation can be cast in terms of high performing matrix- matrix multiplications, which greatly limits the bene ts that can be derived from so-called blocking of algorithms. This paper describes a technique for selecting groups of pivot vectors by means of randomized projections. It is demonstrated that the asymptotic op count for the proposed method is 2mn2 �����(2=3)n3 for an m n matrix, identical to that of the best classical unblocked Householder QR factorization algorithm (with or without pivoting). Experiments demonstrate acceleration in speed of close to an order of magnitude relative to the geqp3 function in LAPACK, when executed on a modern CPU with multiple cores. Further, experiments demonstrate that the quality of the randomized pivot selection strategy is roughly the same as that of classical column pivoting. The described algorithm is made available under Open Source license and can be used with LAPACK or libflame

    On recursive least-squares filtering algorithms and implementations

    Get PDF
    In many real-time signal processing applications, fast and numerically stable algorithms for solving least-squares problems are necessary and important. In particular, under non-stationary conditions, these algorithms must be able to adapt themselves to reflect the changes in the system and take appropriate adjustments to achieve optimum performances. Among existing algorithms, the QR-decomposition (QRD)-based recursive least-squares (RLS) methods have been shown to be useful and effective for adaptive signal processing. In order to increase the speed of processing and achieve high throughput rate, many algorithms are being vectorized and/or pipelined to facilitate high degrees of parallelism. A time-recursive formulation of RLS filtering employing block QRD will be considered first. Several methods, including a new non-continuous windowing scheme based on selectively rejecting contaminated data, were investigated for adaptive processing. Based on systolic triarrays, many other forms of systolic arrays are shown to be capable of implementing different algorithms. Various updating and downdating systolic algorithms and architectures for RLS filtering are examined and compared in details, which include Householder reflector, Gram-Schmidt procedure, and Givens rotation. A unified approach encompassing existing square-root-free algorithms is also proposed. For the sinusoidal spectrum estimation problem, a judicious method of separating the noise from the signal is of great interest. Various truncated QR methods are proposed for this purpose and compared to the truncated SVD method. Computer simulations provided for detailed comparisons show the effectiveness of these methods. This thesis deals with fundamental issues of numerical stability, computational efficiency, adaptivity, and VLSI implementation for the RLS filtering problems. In all, various new and modified algorithms and architectures are proposed and analyzed; the significance of any of the new method depends crucially on specific application

    A fast semi-direct least squares algorithm for hierarchically block separable matrices

    Full text link
    We present a fast algorithm for linear least squares problems governed by hierarchically block separable (HBS) matrices. Such matrices are generally dense but data-sparse and can describe many important operators including those derived from asymptotically smooth radial kernels that are not too oscillatory. The algorithm is based on a recursive skeletonization procedure that exposes this sparsity and solves the dense least squares problem as a larger, equality-constrained, sparse one. It relies on a sparse QR factorization coupled with iterative weighted least squares methods. In essence, our scheme consists of a direct component, comprised of matrix compression and factorization, followed by an iterative component to enforce certain equality constraints. At most two iterations are typically required for problems that are not too ill-conditioned. For an M×NM \times N HBS matrix with M≥NM \geq N having bounded off-diagonal block rank, the algorithm has optimal O(M+N)\mathcal{O} (M + N) complexity. If the rank increases with the spatial dimension as is common for operators that are singular at the origin, then this becomes O(M+N)\mathcal{O} (M + N) in 1D, O(M+N3/2)\mathcal{O} (M + N^{3/2}) in 2D, and O(M+N2)\mathcal{O} (M + N^{2}) in 3D. We illustrate the performance of the method on both over- and underdetermined systems in a variety of settings, with an emphasis on radial basis function approximation and efficient updating and downdating.Comment: 24 pages, 8 figures, 6 tables; to appear in SIAM J. Matrix Anal. App

    UTV Tools:Matlab Templates for Rank-Revealing UTV Decompositions

    Get PDF
    published in Numerical Algorithms and the paper's text is reprinted here by kind permissio

    randUTV: A blocked randomized algorithm for computing a rank-revealing UTV factorization

    Full text link
    This manuscript describes the randomized algorithm randUTV for computing a so called UTV factorization efficiently. Given a matrix AA, the algorithm computes a factorization A=UTV∗A = UTV^{*}, where UU and VV have orthonormal columns, and TT is triangular (either upper or lower, whichever is preferred). The algorithm randUTV is developed primarily to be a fast and easily parallelized alternative to algorithms for computing the Singular Value Decomposition (SVD). randUTV provides accuracy very close to that of the SVD for problems such as low-rank approximation, solving ill-conditioned linear systems, determining bases for various subspaces associated with the matrix, etc. Moreover, randUTV produces highly accurate approximations to the singular values of AA. Unlike the SVD, the randomized algorithm proposed builds a UTV factorization in an incremental, single-stage, and non-iterative way, making it possible to halt the factorization process once a specified tolerance has been met. Numerical experiments comparing the accuracy and speed of randUTV to the SVD are presented. These experiments demonstrate that in comparison to column pivoted QR, which is another factorization that is often used as a relatively economic alternative to the SVD, randUTV compares favorably in terms of speed while providing far higher accuracy

    UTV Tools:Matlab Templates for Rank-Revealing UTV Decompositions

    Get PDF
    We describe a Matlab 5.2 package for computing and modifying certain rank-revealing decompositions that have found widespread use in signal processing and other applications. The package focuses on algorithms for URV and ULV decompositions, collectively known as UTV decompositions. We include algorithms for the ULLV decomposition, which generalizes the ULV decomposition to a pair of matrices. For completeness a few algorithms for computation of the RRQR decomposition are also included. The software in this package can be used as is, or can be considered as templates for specialized implementations on signal processors and similar dedicated hardware platforms

    randUTV: A Blocked Randomized Algorithm for Computing a Rank-Revealing UTV Factorization

    Get PDF
    A randomized algorithm for computing a so-called UTV factorization efficiently is presented. Given a matrix , the algorithm “randUTV” computes a factorization , where and have orthonormal columns, and is triangular (either upper or lower, whichever is preferred). The algorithm randUTV is developed primarily to be a fast and easily parallelized alternative to algorithms for computing the Singular Value Decomposition (SVD). randUTV provides accuracy very close to that of the SVD for problems such as low-rank approximation, solving ill-conditioned linear systems, and determining bases for various subspaces associated with the matrix. Moreover, randUTV produces highly accurate approximations to the singular values of . Unlike the SVD, the randomized algorithm proposed builds a UTV factorization in an incremental, single-stage, and noniterative way, making it possible to halt the factorization process once a specified tolerance has been met. Numerical experiments comparing the accuracy and speed of randUTV to the SVD are presented. Other experiments also demonstrate that in comparison to column-pivoted QR, which is another factorization that is often used as a relatively economic alternative to the SVD, randUTV compares favorably in terms of speed while providing far higher accuracy

    Approximate Inference for Wireless Communications

    Get PDF
    corecore