randUTV: A blocked randomized algorithm for computing a rank-revealing UTV factorization
This manuscript describes the randomized algorithm randUTV for computing a so-called UTV factorization efficiently. Given a matrix A, the algorithm computes a factorization A = UTVᵀ, where U and V have orthonormal columns, and T is triangular (either upper or lower, whichever is preferred). The algorithm randUTV is developed primarily to be a fast and easily parallelized alternative to algorithms for computing the Singular Value Decomposition (SVD). randUTV provides accuracy very close to that of the SVD for problems such as low-rank approximation, solving ill-conditioned linear systems, and determining bases for various subspaces associated with the matrix. Moreover, randUTV produces highly accurate approximations to the singular values of A. Unlike the SVD, the proposed randomized algorithm builds a UTV factorization in an incremental, single-stage, and non-iterative way, making it possible to halt the factorization process once a specified tolerance has been met. Numerical experiments comparing the accuracy and speed of randUTV to the SVD are presented. These experiments demonstrate that, in comparison to column-pivoted QR, another factorization often used as a relatively economical alternative to the SVD, randUTV compares favorably in terms of speed while providing far higher accuracy.
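The factorization described above can be condensed into a few lines of NumPy. The sketch below is illustrative only: the function name randutv_simple is hypothetical, and unlike the authors' blocked, incremental algorithm it builds the whole factorization in one shot. It constructs V from a power-iteration-weighted random sketch of the row space, then obtains U and the triangular factor T from a single unpivoted QR.

```python
import numpy as np

def randutv_simple(A, q=2, seed=0):
    """Illustrative (unblocked) UTV factorization guided by a randomized sketch.

    Returns U, T, V with A = U @ T @ V.T, where U and V have orthonormal
    columns and T is upper triangular. The leading diagonal entries of T
    tend to approximate the leading singular values of A.
    """
    m, n = A.shape
    rng = np.random.default_rng(seed)
    # Weight a Gaussian sketch toward the dominant right singular directions.
    # (Forming (A^T A)^q explicitly is fine for illustration, though a careful
    # implementation would interleave orthonormalizations for stability.)
    Z = np.linalg.matrix_power(A.T @ A, q) @ rng.standard_normal((n, n))
    V, _ = np.linalg.qr(Z)        # orthonormal right factor
    U, T = np.linalg.qr(A @ V)    # unpivoted QR yields U and triangular T
    return U, T, V
```

Because V here is square and orthonormal, A = UTVᵀ holds exactly; the randomization only influences how well the diagonal of T tracks the singular values.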
Building Rank-Revealing Factorizations with Randomization
This thesis describes a set of randomized algorithms for computing rank-revealing factorizations of matrices. These algorithms are designed specifically to minimize the amount of data movement required, which is essential to high practical performance on modern computing hardware. The work presented builds on existing randomized algorithms for computing low-rank approximations to matrices, but essentially extends the range of applicability of these methods by allowing for the efficient decomposition of matrices of any numerical rank, including full-rank matrices. In contrast, existing methods worked well only when the numerical rank was substantially smaller than the dimensions of the matrix.

The thesis describes algorithms for computing two of the most popular rank-revealing matrix decompositions: the column pivoted QR (CPQR) decomposition, and the so-called UTV decomposition that factors a given matrix A as A = UTVᵀ, where U and V have orthonormal columns and T is triangular. For each decomposition, the thesis presents algorithms that are tailored for different computing environments, including multicore shared-memory processors, GPUs, distributed-memory machines, and matrices that are stored on hard drives ("out of core").

The first chapter of the thesis consists of an introduction that provides context, reviews previous work in the field, and summarizes the key contributions. Besides the introduction, the thesis contains six additional chapters.

Chapter 2 introduces a fully blocked algorithm, HQRRP, for computing a QR factorization with column pivoting. The key to the full blocking of the algorithm lies in using randomized projections to create a low-dimensional sketch of the data, from which multiple good pivot columns may be cheaply computed. Numerical experiments show that HQRRP is several times faster than the classical algorithm for computing a column pivoted QR on a multicore machine, and the acceleration factor increases with the number of cores.

Chapter 3 introduces randUTV, a randomized algorithm for computing a rank-revealing factorization of the form A = UTVᵀ, where U and V are orthogonal and T is upper triangular. randUTV uses randomized methods to efficiently build U and V as approximations of the column and row spaces of A. The result is an algorithm that reveals rank nearly as well as the SVD and costs at most as much as a column pivoted QR.

Chapter 4 provides optimized implementations for shared- and distributed-memory architectures. For shared memory, we show that formulating randUTV as an algorithm-by-blocks increases its efficiency in parallel. The fifth chapter implements randUTV on the GPU and augments the algorithm with an oversampling technique to further improve the low-rank approximation properties of the resulting factorization.

Chapter 6 implements both randUTV and HQRRP for use with matrices stored out of core. It is shown that reorganizing HQRRP as a left-looking algorithm, to reduce the number of writes to the drive, is in the tested cases necessary for the scalability of the algorithm when using spinning-disk storage. Finally, Chapter 7 discusses an alternative use for randUTV as a nuclear norm estimator and measures the acceleration gained from trimming down the algorithm when only singular value estimates are required.
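The nuclear-norm use mentioned in Chapter 7 can be illustrated with a trimmed-down sketch: since the diagonal of the triangular factor approximates the singular values, summing its absolute values estimates the nuclear norm. The function below is an assumption-laden illustration (estimate_nuclear_norm is an invented name, and this is a simple unblocked variant, not the thesis code).

```python
import numpy as np

def estimate_nuclear_norm(A, q=2, seed=0):
    """Estimate ||A||_* (the sum of singular values) from a randomized UTV-style sketch."""
    m, n = A.shape
    rng = np.random.default_rng(seed)
    # Power-iteration-weighted sketch of the row space, as in randUTV.
    Z = np.linalg.matrix_power(A.T @ A, q) @ rng.standard_normal((n, n))
    V, _ = np.linalg.qr(Z)        # columns approximate the right singular vectors
    _, T = np.linalg.qr(A @ V)    # diagonal of T approximates the singular values
    return float(np.abs(np.diag(T)).sum())
```

Since each diagonal entry is an inner product uᵢᵀAvᵢ over orthonormal vectors, the estimate never exceeds the true nuclear norm, and it approaches it as V aligns with the right singular vectors.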
Computing rank-revealing factorizations of matrices stored out-of-core
This paper describes efficient algorithms for computing rank-revealing factorizations of matrices that are too large to fit in main memory (RAM) and must instead be stored on slow external memory devices such as disks (out-of-core or out-of-memory). Traditional algorithms for computing rank-revealing factorizations (such as the column pivoted QR factorization and the singular value decomposition) are very communication intensive, as they require many vector-vector and matrix-vector operations, which become prohibitively expensive when data is not in RAM. Randomization makes it possible to reformulate these methods so that large contiguous blocks of the matrix are processed in bulk. The paper describes two distinct methods. The first is a blocked version of column pivoted Householder QR, organized as a "left-looking" method to minimize the number of expensive write operations. The second method employs a UTV factorization. It is organized as an algorithm-by-blocks in order to overlap computations and I/O operations. As it incorporates power iterations, it is much better at revealing the numerical rank. Numerical experiments on several computers demonstrate that the new algorithms are almost as fast when processing data stored on slow memory devices as traditional algorithms are for data stored in RAM.
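The role of the power iterations mentioned above can be seen in a basic randomized range finder. The sketch below is in-memory NumPy with invented names, not the paper's out-of-core algorithm-by-blocks; the point is that each power step is a multiplication by A and Aᵀ, exactly the kind of large contiguous block operation that remains efficient when the matrix must be streamed from disk.

```python
import numpy as np

def randomized_range(A, k, q=2, p=5, seed=0):
    """Orthonormal basis for an approximate dominant range of A, using q power iterations.

    k is the target rank and p a small oversampling parameter.
    """
    rng = np.random.default_rng(seed)
    m, n = A.shape
    Y = A @ rng.standard_normal((n, k + p))   # initial sketch: one pass over A
    for _ in range(q):
        Y, _ = np.linalg.qr(Y)                # re-orthonormalize for numerical stability
        Y = A @ (A.T @ Y)                     # one power iteration: two more passes over A
    Q, _ = np.linalg.qr(Y)
    return Q
```

A rank-(k + p) approximation is then Q @ (Q.T @ A); the power steps sharpen the basis when the singular values decay slowly, which is what improves rank revelation.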
Efficient algorithms for computing rank-revealing factorizations on a GPU
Standard rank-revealing factorizations such as the singular value decomposition and the column pivoted QR factorization are challenging to implement efficiently on a GPU. A major difficulty in this regard is the inability of standard algorithms to cast most operations in terms of the Level-3 BLAS. This paper presents two alternative algorithms for computing a rank-revealing factorization of the form A = UTVᵀ, where U and V are orthogonal and T is triangular. Both algorithms use randomized projection techniques to cast most of the flops in terms of matrix-matrix multiplication, which is exceptionally efficient on the GPU. Numerical experiments illustrate that these algorithms achieve an order of magnitude acceleration over finely tuned GPU implementations of the SVD while providing low-rank approximation errors close to those of the SVD.
Householder QR Factorization With Randomization for Column Pivoting (HQRRP)
A fundamental problem when adding column pivoting to the Householder QR factorization is that only about half of the computation can be cast in terms of high-performing matrix-matrix multiplications, which greatly limits the benefits that can be derived from so-called blocking of algorithms. This paper describes a technique for selecting groups of pivot vectors by means of randomized projections. It is demonstrated that the asymptotic flop count for the proposed method is 2mn² − (2/3)n³ for an m × n matrix, identical to that of the best classical unblocked Householder QR factorization algorithm (with or without pivoting). Experiments demonstrate acceleration in speed of close to an order of magnitude relative to the geqp3 function in LAPACK when executed on a modern CPU with multiple cores. Further, experiments demonstrate that the quality of the randomized pivot selection strategy is roughly the same as that of classical column pivoting. The described algorithm is made available under an open-source license and can be used with LAPACK or libflame.
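The randomized pivot selection can be sketched as follows: compress A into a short sketch Y = GA with a Gaussian G, then apply classical column pivoting to Y, which is cheap because Y has only a handful of rows. The helper below is illustrative, not the HQRRP code: choose_block_pivots is an invented name, and a hand-rolled greedy norm-pivoting loop stands in for a library CPQR on the sketch.

```python
import numpy as np

def choose_block_pivots(A, b, p=5, seed=0):
    """Pick b pivot columns of A using a small (b + p)-row randomized sketch."""
    rng = np.random.default_rng(seed)
    m, n = A.shape
    Y = rng.standard_normal((b + p, m)) @ A   # sketch: column norms roughly preserved
    piv = []
    for _ in range(b):
        norms = np.linalg.norm(Y, axis=0)
        norms[piv] = -1.0                     # exclude columns already chosen
        j = int(np.argmax(norms))             # classical CPQR rule: largest residual norm
        piv.append(j)
        qcol = Y[:, j] / np.linalg.norm(Y[:, j])
        Y = Y - np.outer(qcol, qcol @ Y)      # deflate the chosen direction from the sketch
    return piv
```

In HQRRP, a block of pivots chosen this way drives one blocked Householder step on A itself, so the expensive trailing updates become matrix-matrix multiplications rather than the rank-1 updates of classical pivoting.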
Aurorasaurus database of real-time, crowd-sourced aurora data for space weather research
This technical report documents the details of Aurorasaurus citizen science data for the period spanning 2015 and 2016, as well as its routine data-filtering protocols. Aurorasaurus citizen science data is a collection of auroral sightings submitted to the project via its website or apps and mined from social media. It is a robust data set that is particularly abundant during strong geomagnetic storms, when auroral precipitation models have the highest uncertainty. These data are offered to the scientific community through an open-access database in raw and scientific formats, each of which is described in detail in this technical report. Furthermore, by demonstrating its scientific utility, we aim to encourage its integration into auroral research.
Reproducible big data science: A case study in continuous FAIRness
Big biomedical data create exciting opportunities for discovery, but make it difficult to capture analyses and outputs in forms that are findable, accessible, interoperable, and reusable (FAIR). In response, we describe tools that make it easy to capture, and assign identifiers to, data and code throughout the data lifecycle. We illustrate the use of these tools via a case study involving a multi-step analysis that creates an atlas of putative transcription factor binding sites from terabytes of ENCODE DNase I hypersensitive sites sequencing data. We show how the tools automate routine but complex tasks, capture analysis algorithms in understandable and reusable forms, and harness fast networks and powerful cloud computers to process data rapidly, all without sacrificing usability or reproducibility, thus ensuring that big data are not hard-to-(re)use data. We evaluate our approach via a user study and show that 91% of participants were able to replicate a complex analysis involving considerable data volumes.