Assessing stochastic algorithms for large scale nonlinear least squares problems using extremal probabilities of linear combinations of gamma random variables
This article considers stochastic algorithms for efficiently solving a class
of large scale non-linear least squares (NLS) problems which frequently arise
in applications. We propose eight variants of a practical randomized algorithm
where the uncertainties in the major stochastic steps are quantified. Such
stochastic steps involve approximating the NLS objective function using
Monte-Carlo methods, and this is equivalent to the estimation of the trace of
corresponding symmetric positive semi-definite (SPSD) matrices. For the latter,
we prove tight necessary and sufficient conditions on the sample size (which
translates to cost) to satisfy the prescribed probabilistic accuracy. We show
that these conditions are practically computable and yield small sample sizes.
They are then incorporated in our stochastic algorithm to quantify the
uncertainty in each randomized step. The bounds we use are applications of more
general results regarding extremal tail probabilities of linear combinations of
gamma distributed random variables. We derive and prove new results concerning
the maximal and minimal tail probabilities of such linear combinations, which
can be considered independently of the rest of this paper.
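
As a rough illustration of the trace-estimation step described above, the following sketch uses a Gaussian Monte Carlo estimator: for a standard normal vector w, the quadratic form w^T A w has expectation tr(A), and for SPSD A it is a nonnegative linear combination of chi-squared (hence gamma) random variables. The function name, matrix sizes, and sample count are illustrative assumptions, not the eight variants or the sample-size bounds from the paper.

```python
import numpy as np

def gaussian_trace_estimate(matvec, dim, num_samples, rng):
    """Monte Carlo estimate of tr(A) using only products v -> A v."""
    total = 0.0
    for _ in range(num_samples):
        w = rng.standard_normal(dim)   # w ~ N(0, I)
        total += w @ matvec(w)         # E[w^T A w] = tr(A)
    return total / num_samples

rng = np.random.default_rng(0)
B = rng.standard_normal((50, 20))
A = B @ B.T                            # a 50 x 50 SPSD test matrix (hypothetical data)

exact = np.trace(A)
estimate = gaussian_trace_estimate(lambda v: A @ v, 50, 2000, rng)
relative_error = abs(estimate - exact) / exact
print(f"exact={exact:.2f} estimate={estimate:.2f} rel_err={relative_error:.4f}")
```

The sample count of 2000 is chosen only to make the demonstration stable; the point of the abstract is that much smaller, provably sufficient sample sizes can be computed.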
Optimal approximate matrix product in terms of stable rank
We prove, using the subspace embedding guarantee in a black box way, that one
can achieve the spectral norm guarantee for approximate matrix multiplication
with a dimensionality-reducing map having $m = O(\tilde{r}/\varepsilon^2)$
rows. Here $\tilde{r}$ is the maximum stable rank, i.e. squared ratio of
Frobenius and operator norms, of the two matrices being multiplied. This is a
quantitative improvement over previous work of [MZ11, KVZ14], and is also
optimal for any oblivious dimensionality-reducing map. Furthermore, due to the
black box reliance on the subspace embedding property in our proofs, our
theorem can be applied to a much more general class of sketching matrices than
what was known before, in addition to achieving better bounds. For example, one
can apply our theorem to efficient subspace embeddings such as the Subsampled
Randomized Hadamard Transform or sparse subspace embeddings, or even with
subspace embedding constructions that may be developed in the future.
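
The quantities in this paragraph can be sketched numerically as follows. The code computes the stable rank (squared Frobenius norm over squared operator norm) and approximates a product A^T B through a dense Gaussian sketch. The Gaussian map, the matrix sizes, and the row count m = 1000 are assumptions made for illustration; they are not the constructions or the bound analyzed in the paper.

```python
import numpy as np

def stable_rank(M):
    """Squared ratio of Frobenius norm to operator (spectral) norm."""
    return np.linalg.norm(M, "fro") ** 2 / np.linalg.norm(M, 2) ** 2

def sketched_product(A, B, m, rng):
    """Approximate A^T B by (S A)^T (S B) with an m-row Gaussian map S."""
    n = A.shape[0]
    S = rng.standard_normal((m, n)) / np.sqrt(m)   # E[S^T S] = I
    return (S @ A).T @ (S @ B)

rng = np.random.default_rng(1)
n = 2000
# Two hypothetical n x 30 matrices of rank 5, so their stable rank is at most 5.
A = rng.standard_normal((n, 5)) @ rng.standard_normal((5, 30))
B = rng.standard_normal((n, 5)) @ rng.standard_normal((5, 30))

r = max(stable_rank(A), stable_rank(B))
approx = sketched_product(A, B, m=1000, rng=rng)
# Spectral norm error relative to ||A|| * ||B||, the form of the guarantee.
err = np.linalg.norm(approx - A.T @ B, 2) / (
    np.linalg.norm(A, 2) * np.linalg.norm(B, 2)
)
print(f"max stable rank={r:.2f} relative spectral error={err:.3f}")
```

Note that the sketch size is tied to the stable rank rather than to the number of columns (30 here) or the ambient dimension n.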
Our main theorem, via connections with spectral error matrix multiplication
shown in prior work, implies quantitative improvements for approximate least
squares regression and low rank approximation. Our main result has also already
been applied to improve dimensionality reduction guarantees for $k$-means
clustering [CEMMP14], and implies new results for nonparametric regression
[YPW15].
We also separately point out that the proof of the "BSS" deterministic
row-sampling result of [BSS12] can be modified to show that for any matrices
$A, B$ of stable rank at most $\tilde{r}$, one can achieve the spectral norm
guarantee for approximate matrix multiplication of $A^T B$ by deterministically
sampling $O(\tilde{r}/\varepsilon^2)$ rows that can be found in polynomial
time. The original result of [BSS12] was for rank instead of stable rank. Our
observation leads to a stronger version of a main theorem of [KMST10].
Comment: v3: minor edits; v2: fixed one step in proof of Theorem 9 which was
wrong by a constant factor (see the new Lemma 5 and its use; final theorem
unaffected)
Uniform Sampling for Matrix Approximation
Random sampling has become a critical tool in solving massive matrix
problems. For linear regression, a small, manageable set of data rows can be
randomly selected to approximate a tall, skinny data matrix, improving
processing time significantly. For theoretical performance guarantees, each row
must be sampled with probability proportional to its statistical leverage
score. Unfortunately, leverage scores are difficult to compute.
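
For concreteness, here is a minimal sketch of exact leverage scores and leverage-score row sampling, assuming the matrix has full column rank. The scores are the squared row norms of an orthonormal basis for the column space, which is why computing them exactly costs about as much as solving the regression problem itself. The helper names and the test matrix are hypothetical.

```python
import numpy as np

def leverage_scores(A):
    """Squared row norms of Q from a reduced QR (assumes full column rank)."""
    Q, _ = np.linalg.qr(A)
    return np.sum(Q ** 2, axis=1)

def sample_rows(A, probs, num_rows, rng):
    """Sample rows i.i.d. with the given probabilities and rescale them so
    that the sampled Gram matrix is an unbiased estimate of A^T A."""
    idx = rng.choice(A.shape[0], size=num_rows, replace=True, p=probs)
    scale = 1.0 / np.sqrt(num_rows * probs[idx])
    return A[idx] * scale[:, None]

rng = np.random.default_rng(2)
n, d = 1000, 3
A = np.zeros((n, d))
A[1:, :2] = rng.standard_normal((n - 1, 2))
A[0, 2] = 1.0          # row 0 alone carries the third column

scores = leverage_scores(A)
by_leverage = sample_rows(A, scores / scores.sum(), 200, rng)
uniform = sample_rows(A, np.full(n, 1.0 / n), 200, rng)

print("leverage of row 0:", scores[0])
print("third column kept (leverage):", bool(np.any(by_leverage[:, 2] != 0)))
print("third column kept (uniform): ", bool(np.any(uniform[:, 2] != 0)))
```

Row 0 has leverage score 1, so leverage sampling selects it with probability 1/3 per draw, while a uniform draw hits it only with probability 1/1000.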
A simple alternative is to sample rows uniformly at random. While this often
works, uniform sampling will eliminate critical row information for many
natural instances. We take a fresh look at uniform sampling by examining what
information it does preserve. Specifically, we show that uniform sampling
yields a matrix that, in some sense, well approximates a large fraction of the
original. While this weak form of approximation is not enough for solving
linear regression directly, it is enough to compute a better approximation.
This observation leads to simple iterative row sampling algorithms for matrix
approximation that run in input-sparsity time and preserve row structure and
sparsity at all intermediate steps. In addition to an improved understanding of
uniform sampling, our main proof introduces a structural result of independent
interest: we show that every matrix can be made to have low coherence by
reweighting a small subset of its rows.
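
One way to see why a uniform sample is "enough to compute a better approximation" is that leverage scores computed with respect to a uniform subsample never underestimate the true ones, because the subsample's Gram matrix is dominated by that of the full matrix. The sketch below checks this on hypothetical data. It assumes the subsample has full column rank and is only an illustration of the idea, not the iterative algorithm from the paper.

```python
import numpy as np

def exact_leverage_scores(A):
    """Squared row norms of Q from a reduced QR (assumes full column rank)."""
    Q, _ = np.linalg.qr(A)
    return np.sum(Q ** 2, axis=1)

def leverage_estimates_from_uniform_sample(A, sample_size, rng):
    """Estimate a_i^T (A_sub^T A_sub)^{-1} a_i for every row a_i of A, where
    A_sub is a uniform row subsample, capped at 1."""
    idx = rng.choice(A.shape[0], size=sample_size, replace=False)
    A_sub = A[idx]
    gram_inv = np.linalg.inv(A_sub.T @ A_sub)   # assumes A_sub has full column rank
    est = np.einsum("ij,jk,ik->i", A, gram_inv, A)
    return np.minimum(est, 1.0)

rng = np.random.default_rng(3)
A = rng.standard_normal((500, 4))
A[0] *= 50.0                                    # one high-leverage row

true_scores = exact_leverage_scores(A)
estimates = leverage_estimates_from_uniform_sample(A, 100, rng)

print("sum of true scores:", true_scores.sum())
print("sum of estimates:  ", estimates.sum())
print("all overestimates: ", bool(np.all(estimates >= true_scores - 1e-9)))
```

Sampling rows with probabilities proportional to such overestimates is safe, and the sum of the estimates governs how many rows a second sampling round would need.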