randUTV: A blocked randomized algorithm for computing a rank-revealing UTV factorization
This manuscript describes the randomized algorithm randUTV for computing a so-called UTV factorization efficiently. Given a matrix A, the algorithm computes a factorization A = UTVᵀ, where U and V have orthonormal columns, and T is triangular (either upper or lower, whichever is preferred). The algorithm randUTV is developed primarily to be a fast and easily parallelized alternative to algorithms for computing the Singular Value Decomposition (SVD). randUTV provides accuracy very close to that of the SVD for problems such as low-rank approximation, solving ill-conditioned linear systems, and determining bases for various subspaces associated with the matrix. Moreover, randUTV produces highly accurate approximations to the singular values of A. Unlike the SVD, the proposed randomized algorithm builds a UTV factorization in an incremental, single-stage, and non-iterative way, making it possible to halt the factorization process once a specified tolerance has been met. Numerical experiments comparing the accuracy and speed of randUTV to the SVD are presented. These experiments demonstrate that, in comparison to column-pivoted QR, another factorization often used as a relatively economical alternative to the SVD, randUTV compares favorably in terms of speed while providing far higher accuracy.
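The factorization described above can be condensed into a few lines of NumPy. The sketch below is illustrative only: the function name randutv_simple is hypothetical, and unlike the authors' blocked, incremental algorithm it builds the whole factorization in one shot. It constructs V from a power-iteration-weighted random sketch of the row space, then obtains U and the triangular factor T from a single unpivoted QR.

```python
import numpy as np

def randutv_simple(A, q=2, seed=0):
    """Illustrative (unblocked) UTV factorization guided by a randomized sketch.

    Returns U, T, V with A = U @ T @ V.T, where U and V have orthonormal
    columns and T is upper triangular. The leading diagonal entries of T
    tend to approximate the leading singular values of A.
    """
    m, n = A.shape
    rng = np.random.default_rng(seed)
    # Weight a Gaussian sketch toward the dominant right singular directions.
    # (Forming (A^T A)^q explicitly is fine for illustration, though a careful
    # implementation would interleave orthonormalizations for stability.)
    Z = np.linalg.matrix_power(A.T @ A, q) @ rng.standard_normal((n, n))
    V, _ = np.linalg.qr(Z)        # orthonormal right factor
    U, T = np.linalg.qr(A @ V)    # unpivoted QR yields U and triangular T
    return U, T, V
```

Because V here is square and orthonormal, A = UTVᵀ holds exactly; the randomization only influences how well the diagonal of T tracks the singular values.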
Building Rank-Revealing Factorizations with Randomization
This thesis describes a set of randomized algorithms for computing rank-revealing factorizations of matrices. These algorithms are designed specifically to minimize the amount of data movement required, which is essential to high practical performance on modern computing hardware. The work presented builds on existing randomized algorithms for computing low-rank approximations to matrices, but essentially extends the range of applicability of these methods by allowing for the efficient decomposition of matrices of any numerical rank, including full-rank matrices. In contrast, existing methods worked well only when the numerical rank was substantially smaller than the dimensions of the matrix.

The thesis describes algorithms for computing two of the most popular rank-revealing matrix decompositions: the column pivoted QR (CPQR) decomposition, and the so-called UTV decomposition that factors a given matrix A as A = UTVᵀ, where U and V have orthonormal columns and T is triangular. For each decomposition, the thesis presents algorithms that are tailored for different computing environments, including multicore shared-memory processors, GPUs, distributed-memory machines, and matrices that are stored on hard drives ("out of core").

The first chapter of the thesis consists of an introduction that provides context, reviews previous work in the field, and summarizes the key contributions. Besides the introduction, the thesis contains six additional chapters.

Chapter 2 introduces a fully blocked algorithm, HQRRP, for computing a QR factorization with column pivoting. The key to the full blocking of the algorithm lies in using randomized projections to create a low-dimensional sketch of the data, from which multiple good pivot columns may be cheaply computed. Numerical experiments show that HQRRP is several times faster than the classical algorithm for computing a column pivoted QR on a multicore machine, and the acceleration factor increases with the number of cores.

Chapter 3 introduces randUTV, a randomized algorithm for computing a rank-revealing factorization of the form A = UTVᵀ, where U and V are orthogonal and T is upper triangular. randUTV uses randomized methods to efficiently build U and V as approximations of the column and row spaces of A. The result is an algorithm that reveals rank nearly as well as the SVD and costs at most as much as a column pivoted QR.

Chapter 4 provides optimized implementations for shared- and distributed-memory architectures. For shared memory, we show that formulating randUTV as an algorithm-by-blocks increases its efficiency in parallel. The fifth chapter implements randUTV on the GPU and augments the algorithm with an oversampling technique to further improve the low-rank approximation properties of the resulting factorization.

Chapter 6 implements both randUTV and HQRRP for use with matrices stored out of core. It is shown that reorganizing HQRRP as a left-looking algorithm, to reduce the number of writes to the drive, is in the tested cases necessary for the scalability of the algorithm when using spinning-disk storage. Finally, Chapter 7 discusses an alternative use for randUTV as a nuclear norm estimator and measures the acceleration gained from trimming down the algorithm when only singular value estimates are required.
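The nuclear-norm use mentioned in Chapter 7 can be illustrated with a trimmed-down sketch: since the diagonal of the triangular factor approximates the singular values, summing its absolute values estimates the nuclear norm. The function below is an assumption-laden illustration (estimate_nuclear_norm is an invented name, and this is a simple unblocked variant, not the thesis code).

```python
import numpy as np

def estimate_nuclear_norm(A, q=2, seed=0):
    """Estimate ||A||_* (the sum of singular values) from a randomized UTV-style sketch."""
    m, n = A.shape
    rng = np.random.default_rng(seed)
    # Power-iteration-weighted sketch of the row space, as in randUTV.
    Z = np.linalg.matrix_power(A.T @ A, q) @ rng.standard_normal((n, n))
    V, _ = np.linalg.qr(Z)        # columns approximate the right singular vectors
    _, T = np.linalg.qr(A @ V)    # diagonal of T approximates the singular values
    return float(np.abs(np.diag(T)).sum())
```

Since each diagonal entry is an inner product uᵢᵀAvᵢ over orthonormal vectors, the estimate never exceeds the true nuclear norm, and it approaches it as V aligns with the right singular vectors.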
Computing rank-revealing factorizations of matrices stored out-of-core
This paper describes efficient algorithms for computing rank-revealing factorizations of matrices that are too large to fit in main memory (RAM) and must instead be stored on slow external memory devices such as disks (out-of-core or out-of-memory). Traditional algorithms for computing rank-revealing factorizations (such as the column pivoted QR factorization and the singular value decomposition) are very communication intensive, as they require many vector-vector and matrix-vector operations, which become prohibitively expensive when data is not in RAM. Randomization makes it possible to reformulate these methods so that large contiguous blocks of the matrix are processed in bulk. The paper describes two distinct methods. The first is a blocked version of column pivoted Householder QR, organized as a "left-looking" method to minimize the number of expensive write operations. The second method employs a UTV factorization. It is organized as an algorithm-by-blocks in order to overlap computations and I/O operations. As it incorporates power iterations, it is much better at revealing the numerical rank. Numerical experiments on several computers demonstrate that the new algorithms are almost as fast when processing data stored on slow memory devices as traditional algorithms are for data stored in RAM.
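The role of the power iterations mentioned above can be seen in a basic randomized range finder. The sketch below is in-memory NumPy with invented names, not the paper's out-of-core algorithm-by-blocks; the point is that each power step is a multiplication by A and Aᵀ, exactly the kind of large contiguous block operation that remains efficient when the matrix must be streamed from disk.

```python
import numpy as np

def randomized_range(A, k, q=2, p=5, seed=0):
    """Orthonormal basis for an approximate dominant range of A, using q power iterations.

    k is the target rank and p a small oversampling parameter.
    """
    rng = np.random.default_rng(seed)
    m, n = A.shape
    Y = A @ rng.standard_normal((n, k + p))   # initial sketch: one pass over A
    for _ in range(q):
        Y, _ = np.linalg.qr(Y)                # re-orthonormalize for numerical stability
        Y = A @ (A.T @ Y)                     # one power iteration: two more passes over A
    Q, _ = np.linalg.qr(Y)
    return Q
```

A rank-(k + p) approximation is then Q @ (Q.T @ A); the power steps sharpen the basis when the singular values decay slowly, which is what improves rank revelation.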
Efficient algorithms for computing rank-revealing factorizations on a GPU
Standard rank-revealing factorizations such as the singular value decomposition and the column pivoted QR factorization are challenging to implement efficiently on a GPU. A major difficulty in this regard is the inability of standard algorithms to cast most operations in terms of the Level-3 BLAS. This paper presents two alternative algorithms for computing a rank-revealing factorization of the form A = UTVᵀ, where U and V are orthogonal and T is triangular. Both algorithms use randomized projection techniques to cast most of the flops in terms of matrix-matrix multiplication, which is exceptionally efficient on the GPU. Numerical experiments illustrate that these algorithms achieve an order of magnitude acceleration over finely tuned GPU implementations of the SVD while providing low-rank approximation errors close to those of the SVD.
Householder QR Factorization With Randomization for Column Pivoting (HQRRP)
A fundamental problem when adding column pivoting to the Householder QR factorization is that only about half of the computation can be cast in terms of high-performing matrix-matrix multiplications, which greatly limits the benefits that can be derived from so-called blocking of algorithms. This paper describes a technique for selecting groups of pivot vectors by means of randomized projections. It is demonstrated that the asymptotic flop count for the proposed method is 2mn² − (2/3)n³ for an m × n matrix, identical to that of the best classical unblocked Householder QR factorization algorithm (with or without pivoting). Experiments demonstrate acceleration in speed of close to an order of magnitude relative to the geqp3 function in LAPACK when executed on a modern CPU with multiple cores. Further, experiments demonstrate that the quality of the randomized pivot selection strategy is roughly the same as that of classical column pivoting. The described algorithm is made available under an open-source license and can be used with LAPACK or libflame.
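The randomized pivot selection can be sketched as follows: compress A into a short sketch Y = GA with a Gaussian G, then apply classical column pivoting to Y, which is cheap because Y has only a handful of rows. The helper below is illustrative, not the HQRRP code: choose_block_pivots is an invented name, and a hand-rolled greedy norm-pivoting loop stands in for a library CPQR on the sketch.

```python
import numpy as np

def choose_block_pivots(A, b, p=5, seed=0):
    """Pick b pivot columns of A using a small (b + p)-row randomized sketch."""
    rng = np.random.default_rng(seed)
    m, n = A.shape
    Y = rng.standard_normal((b + p, m)) @ A   # sketch: column norms roughly preserved
    piv = []
    for _ in range(b):
        norms = np.linalg.norm(Y, axis=0)
        norms[piv] = -1.0                     # exclude columns already chosen
        j = int(np.argmax(norms))             # classical CPQR rule: largest residual norm
        piv.append(j)
        qcol = Y[:, j] / np.linalg.norm(Y[:, j])
        Y = Y - np.outer(qcol, qcol @ Y)      # deflate the chosen direction from the sketch
    return piv
```

In HQRRP, a block of pivots chosen this way drives one blocked Householder step on A itself, so the expensive trailing updates become matrix-matrix multiplications rather than the rank-1 updates of classical pivoting.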
Aurorasaurus database of real-time, crowd-sourced aurora data for space weather research
This technical report documents the details of Aurorasaurus citizen science data for the period spanning 2015 and 2016, as well as its routine data-filtering protocols. Aurorasaurus citizen science data is a collection of auroral sightings submitted to the project via its website or apps and mined from social media. It is a robust data set that is particularly abundant during strong geomagnetic storms, when auroral precipitation models have the highest uncertainty. These data are offered to the scientific community through an open-access database in raw and scientific formats, each of which is described in detail in this technical report. Furthermore, by demonstrating its scientific utility, we aim to encourage its integration into auroral research.
Reproducible big data science: A case study in continuous FAIRness
Big biomedical data create exciting opportunities for discovery, but make it difficult to capture analyses and outputs in forms that are findable, accessible, interoperable, and reusable (FAIR). In response, we describe tools that make it easy to capture, and assign identifiers to, data and code throughout the data lifecycle. We illustrate the use of these tools via a case study involving a multi-step analysis that creates an atlas of putative transcription factor binding sites from terabytes of ENCODE DNase I hypersensitive sites sequencing data. We show how the tools automate routine but complex tasks, capture analysis algorithms in understandable and reusable forms, and harness fast networks and powerful cloud computers to process data rapidly, all without sacrificing usability or reproducibility, thus ensuring that big data are not hard-to-(re)use data. We evaluate our approach via a user study and show that 91% of participants were able to replicate a complex analysis involving considerable data volumes.