Simple Analyses of the Sparse Johnson-Lindenstrauss Transform
For every n-point subset X of Euclidean space and target distortion 1+eps for 0 < eps < 1/2, there exists a map f : X -> l_2^m where f(x) = Ax, for A a matrix with m rows such that (1) m = O((log n)/eps^2), and (2) each column of A is sparse, having only O(eps m) non-zero entries. Though the constructions given for such A in (Kane, Nelson, J. ACM 2014) are simple, the analyses are not, employing intricate combinatorial arguments. We here give two simple alternative proofs of their main result, involving no delicate combinatorics. One of these proofs has already been tested pedagogically: covering all details took the third author slightly under forty minutes, at a casual pace, in a blackboard course lecture.
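The sparse matrix described above can be sketched concretely in numpy. This is an illustrative construction only: the target-dimension and sparsity formulas below use the abstract's asymptotic expressions with constant 1, which is not a choice made by the paper.

```python
import numpy as np

def sparse_jl_matrix(d, n_points, eps, rng=None):
    """Illustrative sparse JL matrix: m ~ (log n)/eps^2 rows,
    s ~ eps*m non-zeros per column, each non-zero a random sign
    scaled by 1/sqrt(s) so every column has unit norm.
    Constants are placeholders, not the paper's."""
    rng = np.random.default_rng() if rng is None else rng
    m = max(1, int(np.ceil(np.log(n_points) / eps**2)))
    s = max(1, int(np.ceil(eps * m)))
    A = np.zeros((m, d))
    for j in range(d):
        rows = rng.choice(m, size=s, replace=False)  # s distinct rows per column
        signs = rng.choice([-1.0, 1.0], size=s)      # independent random signs
        A[rows, j] = signs / np.sqrt(s)
    return A
```

Each column then has exactly s non-zero entries and unit Euclidean norm, so A preserves norms in expectation.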
Simple Analysis of Sparse, Sign-Consistent JL
Allen-Zhu, Gelashvili, Micali, and Shavit construct a sparse, sign-consistent Johnson-Lindenstrauss distribution, and prove that this distribution yields an essentially optimal dimension for the correct choice of sparsity. However, their analysis of the upper bound on the dimension and sparsity requires a complicated combinatorial graph-based argument similar to Kane and Nelson's analysis of sparse JL. We present a simple, combinatorics-free analysis of sparse, sign-consistent JL that yields the same dimension and sparsity upper bounds as the original analysis. Our analysis also yields dimension/sparsity tradeoffs, which were not previously known.
As with previous proofs in this area, our analysis is based on applying Markov's inequality to the pth moment of an error term that can be expressed as a quadratic form of Rademacher variables. Interestingly, we show that, unlike in previous work in the area, the traditionally used Hanson-Wright bound is not strong enough to yield our desired result. Indeed, although the Hanson-Wright bound is known to be optimal for Gaussian degree-2 chaos, it was already shown to be suboptimal for Rademachers. Surprisingly, we are able to show a simple moment bound for quadratic forms of Rademachers that is sufficiently tight to achieve our desired result, which, given the ubiquity of moment and tail bounds in theoretical computer science, is likely to be of broader interest.
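Sign-consistency means all non-zero entries within a column of the sketch matrix share a single random sign, rather than getting independent signs. A minimal numpy sketch of such a distribution, with the dimension m and sparsity s left as free parameters (the paper's optimal choices are not encoded here):

```python
import numpy as np

def sign_consistent_sjl(d, m, s, rng=None):
    """Sign-consistent sparse JL sketch: each column has s non-zeros
    in uniformly random rows, all sharing ONE random sign.
    m and s are caller-supplied; this does not implement the paper's
    tuned parameter settings."""
    rng = np.random.default_rng() if rng is None else rng
    A = np.zeros((m, d))
    for j in range(d):
        rows = rng.choice(m, size=s, replace=False)
        sigma = rng.choice([-1.0, 1.0])      # one sign per column: sign-consistency
        A[rows, j] = sigma / np.sqrt(s)
    return A
```

The only difference from the standard sparse JL construction is that `sigma` is drawn once per column instead of once per non-zero entry.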
Practical sketching algorithms for low-rank matrix approximation
This paper describes a suite of algorithms for constructing low-rank
approximations of an input matrix from a random linear image of the matrix,
called a sketch. These methods can preserve structural properties of the input
matrix, such as positive-semidefiniteness, and they can produce approximations
with a user-specified rank. The algorithms are simple, accurate, numerically
stable, and provably correct. Moreover, each method is accompanied by an
informative error bound that allows users to select parameters a priori to
achieve a given approximation quality. These claims are supported by numerical
experiments with real and synthetic data.
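The core idea (approximate a matrix from a random linear image of it) can be illustrated with a minimal randomized range-finder in numpy. This is a generic sketch of the technique (Gaussian test matrix, QR, small SVD), not the paper's specific fixed-rank or structure-preserving algorithms:

```python
import numpy as np

def sketch_low_rank(A, k, oversample=5, rng=None):
    """Rank-k approximation of A from the sketch Y = A @ Omega.
    Minimal illustration with a Gaussian test matrix and a small
    oversampling parameter; both are assumptions, not prescriptions."""
    rng = np.random.default_rng() if rng is None else rng
    m, n = A.shape
    Omega = rng.standard_normal((n, k + oversample))
    Y = A @ Omega                       # the sketch: random linear image of A
    Q, _ = np.linalg.qr(Y)              # orthonormal basis for the sketch range
    B = Q.T @ A                         # project A onto that basis
    U, svals, Vt = np.linalg.svd(B, full_matrices=False)
    return (Q @ U[:, :k]) * svals[:k] @ Vt[:k]   # truncate to rank k
```

When A has exact rank at most k, the sketch captures its range almost surely and the approximation is exact up to floating-point error.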
Sharper Bounds for Regularized Data Fitting
We study matrix sketching methods for regularized variants of linear regression, low rank approximation, and canonical correlation analysis. Our main focus is on sketching techniques which preserve the objective function value for regularized problems, an area that has remained largely unexplored. We study regularization both in a fairly broad setting and in the specific context of the popular and widely used technique of ridge regularization; for the latter, as applied to each of these problems, we show algorithmic resource bounds in which the statistical dimension appears in places where in previous bounds the rank would appear. The statistical dimension is always smaller than the rank, and decreases as the amount of regularization increases. In particular we show this for the ridge low-rank approximation problem as well as regularized low-rank approximation problems in a much more general setting, where the regularizing function satisfies some very general conditions (chiefly, invariance under orthogonal transformations).
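The statistical dimension invoked here has a standard closed form in terms of singular values: sd_lambda(A) = sum_i sigma_i^2 / (sigma_i^2 + lambda). A small numpy helper makes the two properties claimed above (always at most the rank, decreasing in lambda) easy to check:

```python
import numpy as np

def statistical_dimension(A, lam):
    """sd_lambda(A) = sum_i s_i^2 / (s_i^2 + lam), with s_i the
    singular values of A. Equals the rank at lam = 0 and decreases
    monotonically as lam grows."""
    s = np.linalg.svd(A, compute_uv=False)
    return float(np.sum(s**2 / (s**2 + lam)))
```

For example, sd_0(I_3) = 3 (the rank), while sd_1(I_3) = 3 * 1/(1+1) = 1.5.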
Randomized low-rank approximation for symmetric indefinite matrices
The Nyström method is a popular choice for finding a low-rank approximation
to a symmetric positive semi-definite matrix. The method can fail when applied
to symmetric indefinite matrices, for which the error can be unboundedly large.
In this work, we first identify the main challenges in finding a Nyström
approximation to symmetric indefinite matrices. We then prove the existence of
a variant that overcomes the instability, and establish relative-error nuclear
norm bounds on the resulting approximation that hold when the singular values
decay rapidly. The analysis naturally leads to a practical algorithm, whose
robustness is illustrated with experiments.
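The basic Nyström approximation A_hat = (A Omega)(Omega^T A Omega)^+ (A Omega)^T can be sketched in a few lines of numpy. Note this is the plain PSD-oriented form whose potential instability on indefinite matrices motivates the paper; the stabilized variant the paper proposes is not reproduced here:

```python
import numpy as np

def nystrom(A, r, rng=None):
    """Plain Nystrom approximation of a symmetric matrix A from a
    Gaussian sketch with r columns. Reliable for PSD A; for indefinite
    A the core matrix can be badly conditioned and the error can blow up."""
    rng = np.random.default_rng() if rng is None else rng
    n = A.shape[0]
    Omega = rng.standard_normal((n, r))
    Y = A @ Omega                      # sketch: A Omega
    core = Omega.T @ Y                 # core matrix: Omega^T A Omega
    return Y @ np.linalg.pinv(core) @ Y.T
```

For a PSD matrix of rank at most r, this recovers A exactly (almost surely) up to floating-point error.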