Randomized Dimension Reduction on Massive Data
Scalability of statistical estimators is of increasing importance in modern
applications and dimension reduction is often used to extract relevant
information from data. A variety of popular dimension reduction approaches can
be framed as symmetric generalized eigendecomposition problems. In this paper
we outline how taking into account the low rank structure assumption implicit
in these dimension reduction approaches provides both computational and
statistical advantages. We adapt recent randomized low-rank approximation
algorithms to provide efficient solutions to three dimension reduction methods:
Principal Component Analysis (PCA), Sliced Inverse Regression (SIR), and
Localized Sliced Inverse Regression (LSIR). A key observation in this paper is
that randomization serves a dual role, improving both computational and
statistical performance. This point is highlighted in our experiments on real
and simulated data.
Comment: 31 pages, 6 figures. Key words: dimension reduction, generalized eigendecomposition, low-rank, supervised, inverse regression, random projections, randomized algorithms, Krylov subspace method
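The randomized machinery these methods share can be sketched for the simplest of the three, PCA. The following is a minimal NumPy illustration of a randomized range finder applied to the centered data, not the authors' implementation; the function name and parameters are hypothetical:

```python
import numpy as np

def randomized_pca(X, k, oversample=10, seed=0):
    """Top-k principal components of a data matrix X (n samples x d features)
    via a randomized range finder on the centered data."""
    rng = np.random.default_rng(seed)
    Xc = X - X.mean(axis=0)                      # center the data
    ell = k + oversample                         # oversampled sketch size
    Omega = rng.standard_normal((Xc.shape[1], ell))
    Y = Xc @ Omega                               # random sketch of the column space
    Q, _ = np.linalg.qr(Y)                       # orthonormal basis for range(Y)
    B = Q.T @ Xc                                 # small (ell x d) projected matrix
    _, s, Vt = np.linalg.svd(B, full_matrices=False)
    return Vt[:k], (s[:k] ** 2) / (len(X) - 1)   # components, explained variances

# usage: recover a planted 2-dimensional signal from noisy 50-dimensional data
rng = np.random.default_rng(1)
X = rng.standard_normal((500, 2)) @ rng.standard_normal((2, 50)) \
    + 0.01 * rng.standard_normal((500, 50))
comps, var = randomized_pca(X, k=2)
```

When the data are approximately low rank, the sketch size ell can be far smaller than the ambient dimension, which is where the computational savings come from.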
Krylov-aware stochastic trace estimation
We introduce an algorithm for estimating the trace of a matrix function f(A) using implicit matrix-vector products with a symmetric matrix A. Existing methods for implicit trace estimation of a matrix function tend to treat matrix-vector products with f(A) as a black box to be computed by a Krylov subspace method. Like other recent algorithms for implicit trace estimation, our approach is based on a combination of deflation and stochastic trace estimation. However, we take a closer look at how products with f(A) are integrated into these approaches, which enables several efficiencies not present in previously studied methods. In particular, we describe a Krylov subspace method for computing a low-rank approximation of a matrix function by a computationally efficient projection onto a Krylov subspace.
Comment: Figure 5.1 differs somewhat from the published version due to a clerical error made when uploading the images to the journal.
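The deflation-plus-stochastic-estimation combination the abstract refers to can be illustrated with a dense toy version, using f = exp as the test function. This is a hedged sketch only: the paper's contribution is precisely to avoid forming f(A) explicitly, whereas this illustration forms it via a dense eigendecomposition.

```python
import numpy as np

def deflated_trace_exp(A, k=15, m=50, seed=0):
    """Estimate tr(exp(A)) for a symmetric matrix A by combining deflation
    with Girard-Hutchinson stochastic trace estimation (Hutch++-style).
    Dense illustration: a Krylov-aware method would never form exp(A)."""
    rng = np.random.default_rng(seed)
    n = A.shape[0]
    w, V = np.linalg.eigh(A)
    fA = (V * np.exp(w)) @ V.T              # dense stand-in for f(A) = exp(A)
    # deflation: handle the dominant subspace of f(A) exactly
    Q, _ = np.linalg.qr(fA @ rng.standard_normal((n, k)))
    t_defl = np.trace(Q.T @ fA @ Q)
    # stochastic estimate of the deflated remainder with Rademacher probes
    G = rng.choice([-1.0, 1.0], size=(n, m))
    G = G - Q @ (Q.T @ G)                   # project probes off the deflated space
    t_rand = np.trace(G.T @ fA @ G) / m
    return t_defl + t_rand
```

Because tr(f(A)) splits exactly into the deflated part plus the trace of the projected remainder, the stochastic part only has to estimate a matrix with much smaller Frobenius norm, which sharply reduces the estimator's variance.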
Randomized algorithms for low-rank matrix approximation: Design, analysis, and applications
This survey explores modern approaches for computing low-rank approximations
of high-dimensional matrices by means of the randomized SVD, randomized
subspace iteration, and randomized block Krylov iteration. The paper compares
the procedures via theoretical analyses and numerical studies to highlight how
the best choice of algorithm depends on spectral properties of the matrix and
the computational resources available.
Despite superior performance for many problems, randomized block Krylov
iteration has not been widely adopted in computational science. The paper
strengthens the case for this method in three ways. First, it presents new
pseudocode that can significantly reduce computational costs. Second, it
provides a new analysis that yields simple, precise, and informative error
bounds. Last, it showcases applications to challenging scientific problems,
including principal component analysis for genetic data and spectral clustering
for molecular dynamics data.
Comment: 60 pages, 14 figures
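A basic form of the randomized block Krylov iteration the survey analyzes can be sketched as follows. This is an illustrative NumPy version under standard choices; the survey's pseudocode includes stabilization and cost-saving refinements omitted here:

```python
import numpy as np

def randomized_block_krylov(A, k, q=3, seed=0):
    """Rank-k approximation of A via randomized block Krylov iteration:
    build the block Krylov subspace span{A Om, (A A^T) A Om, ..., (A A^T)^q A Om}
    and project A onto it before truncating."""
    rng = np.random.default_rng(seed)
    m, n = A.shape
    Omega = rng.standard_normal((n, k))
    blocks = []
    Y = A @ Omega
    for _ in range(q):
        blocks.append(Y)
        Y = A @ (A.T @ Y)                        # next Krylov block
    blocks.append(Y)
    Q, _ = np.linalg.qr(np.hstack(blocks))       # basis for the Krylov subspace
    B = Q.T @ A                                  # project A onto the subspace
    U, s, Vt = np.linalg.svd(B, full_matrices=False)
    return ((Q @ U[:, :k]) * s[:k]) @ Vt[:k]     # truncate to rank k
```

Relative to plain randomized subspace iteration with the same number of matrix products, keeping all the intermediate blocks enlarges the search subspace, which is the source of the method's superior accuracy on matrices with slowly decaying spectra.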
Numerical computation of the equilibrium-reduced density matrix for strongly coupled open quantum systems
We describe a numerical algorithm for approximating the equilibrium-reduced
density matrix and the effective (mean force) Hamiltonian for a set of system
spins coupled strongly to a set of bath spins when the total system
(system+bath) is held in canonical thermal equilibrium by weak coupling with a
"super-bath". Our approach is a generalization of now standard typicality
algorithms for computing the quantum expectation value of observables of bare
quantum systems via trace estimators and Krylov subspace methods. In
particular, our algorithm makes use of the fact that the reduced system
density, when the bath is measured in a given random state, tends to
concentrate about the corresponding thermodynamic averaged reduced system
density. Theoretical error analysis and numerical experiments are given to
validate the accuracy of our algorithm. Further numerical experiments
demonstrate the potential of our approach for applications including the study
of quantum phase transitions and entanglement entropy for long-range
interaction systems.
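The typicality idea, namely that a single random state already reproduces thermal averages, can be illustrated on a small bare quantum system. This is a dense NumPy sketch of the standard estimator the abstract generalizes; for large systems the dense matrix exponential would be replaced by a Krylov subspace method, and the names here are hypothetical:

```python
import numpy as np

def thermal_expectation(H, O, beta, n_samples=100, seed=0):
    """Typicality estimate of <O>_beta = tr(O e^{-beta H}) / tr(e^{-beta H})
    using random Gaussian states (dense small-scale illustration)."""
    rng = np.random.default_rng(seed)
    w, V = np.linalg.eigh(H)                              # dense spectral decomposition
    half = (V * np.exp(-0.5 * beta * w)) @ V.T            # e^{-beta H / 2}
    num = den = 0.0
    for _ in range(n_samples):
        z = rng.standard_normal(H.shape[0])               # random "typical" state
        phi = half @ z                                    # imaginary-time evolved state
        num += phi @ (O @ phi)                            # sample of tr(O e^{-beta H})
        den += phi @ phi                                  # sample of tr(e^{-beta H})
    return num / den
```

The concentration phenomenon the abstract exploits is visible in the variance of this estimator: each random state contributes an unbiased sample of both traces, and the fluctuations shrink with the effective thermal dimension of the system.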
Finding Structure with Randomness: Probabilistic Algorithms for Constructing Approximate Matrix Decompositions
Low-rank matrix approximations, such as the truncated singular value decomposition and the rank-revealing QR decomposition, play a central role in data analysis and scientific computing. This work surveys and extends recent research which demonstrates that randomization offers a powerful tool for performing low-rank matrix approximation. These techniques exploit modern computational architectures more fully than classical methods and open the possibility of dealing with truly massive data sets.

This paper presents a modular framework for constructing randomized algorithms that compute partial matrix decompositions. These methods use random sampling to identify a subspace that captures most of the action of a matrix. The input matrix is then compressed—either explicitly or implicitly—to this subspace, and the reduced matrix is manipulated deterministically to obtain the desired low-rank factorization. In many cases, this approach beats its classical competitors in terms of accuracy, robustness, and/or speed. These claims are supported by extensive numerical experiments and a detailed error analysis.

The specific benefits of randomized techniques depend on the computational environment. Consider the model problem of finding the k dominant components of the singular value decomposition of an m × n matrix. (i) For a dense input matrix, randomized algorithms require O(mn log(k)) floating-point operations (flops) in contrast to O(mnk) for classical algorithms. (ii) For a sparse input matrix, the flop count matches classical Krylov subspace methods, but the randomized approach is more robust and can easily be reorganized to exploit multiprocessor architectures. (iii) For a matrix that is too large to fit in fast memory, the randomized techniques require only a constant number of passes over the data, as opposed to O(k) passes for classical algorithms. In fact, it is sometimes possible to perform matrix approximation with a single pass over the data.
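The single-pass claim in (iii) can be made concrete for a symmetric matrix: the sketch Y = A Omega is the only access to A, and the small projected matrix is recovered afterwards by a least-squares solve. This is an illustrative NumPy sketch in the spirit of the survey's one-pass variants; the function name is hypothetical:

```python
import numpy as np

def single_pass_eig(A_times, Omega, k):
    """Single-pass randomized eigendecomposition of a symmetric matrix.
    The matrix is touched only through the sketch Y = A @ Omega, so A can
    be streamed; `A_times` applies A once, Omega is the (n x ell) test matrix."""
    Y = A_times(Omega)                       # the single pass over A
    Q, _ = np.linalg.qr(Y)
    # B ~= Q^T A Q satisfies B (Q^T Omega) = Q^T Y; recover B by least squares
    C = Q.T @ Omega
    B = np.linalg.lstsq(C.T, (Q.T @ Y).T, rcond=None)[0].T
    B = (B + B.T) / 2                        # symmetrize against rounding
    w, U = np.linalg.eigh(B)
    idx = np.argsort(np.abs(w))[::-1][:k]
    return w[idx], Q @ U[:, idx]             # approximate eigenpairs of A
```

Because both the range and the projected matrix are extracted from the same sketch, no second visit to A is needed, which is exactly what makes the scheme viable for matrices too large to fit in fast memory.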