127 research outputs found
Input Sparsity and Hardness for Robust Subspace Approximation
In the subspace approximation problem, we seek a k-dimensional subspace F of
R^d that minimizes the sum of p-th powers of Euclidean distances to a given set
of n points a_1, ..., a_n in R^d, for p >= 1. More generally than minimizing
sum_i dist(a_i,F)^p, we may wish to minimize sum_i M(dist(a_i,F)) for some loss
function M(); examples include M-Estimator losses such as the Huber and Tukey loss
functions. Such subspaces provide alternatives to the singular value
decomposition (SVD), which is the p=2 case, finding such an F that minimizes
the sum of squares of distances. For p in [1,2), and for typical M-Estimators,
the minimizing subspace gives a solution that is more robust to outliers than that
provided by the SVD. We give several algorithmic and hardness results for these
robust subspace approximation problems.
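To make the objectives concrete, the following minimal NumPy sketch (illustrative only, not taken from the paper) evaluates sum_i dist(a_i, F)^p and sum_i M(dist(a_i, F)) for a candidate subspace F given by an orthonormal basis; the function names and the Huber parameter delta are assumptions made for the example.

import numpy as np

def subspace_distances(A, V):
    # Euclidean distance from each row of A (n x d) to the subspace
    # spanned by the orthonormal columns of V (d x k).
    residuals = A - (A @ V) @ V.T
    return np.linalg.norm(residuals, axis=1)

def lp_objective(A, V, p=1.0):
    # sum_i dist(a_i, F)^p, the robust subspace approximation objective.
    return np.sum(subspace_distances(A, V) ** p)

def huber(r, delta=1.0):
    # Huber loss: quadratic near zero, linear in the tails.
    return np.where(np.abs(r) <= delta, 0.5 * r**2, delta * (np.abs(r) - 0.5 * delta))

def m_objective(A, V, loss=huber):
    # sum_i M(dist(a_i, F)) for a loss function M, e.g. the Huber loss.
    return np.sum(loss(subspace_distances(A, V)))

# Example: evaluate the p = 2 (SVD) subspace under the p = 2, p = 1, and Huber objectives.
rng = np.random.default_rng(0)
A = rng.standard_normal((100, 10))
V = np.linalg.svd(A, full_matrices=False)[2][:3].T   # top-3 right singular vectors
print(lp_objective(A, V, p=2), lp_objective(A, V, p=1), m_objective(A, V))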
We think of the n points as forming an n x d matrix A, and let nnz(A)
denote the number of non-zero entries of A. Our results hold for p in [1,2). We
use poly(n) to denote n^{O(1)} as n -> infty. We obtain: (1) for minimizing
sum_i dist(a_i,F)^p, we give an algorithm running in O(nnz(A) +
(n+d) poly(k/eps) + exp(poly(k/eps))) time; (2) we show that the problem of
minimizing sum_i dist(a_i, F)^p is NP-hard, even to output a
(1+1/poly(d))-approximation, answering a question of Kannan and Vempala, and
complementing prior results which held for p > 2; (3) for loss functions for a
wide class of M-Estimators, we give a problem-size reduction: for a parameter
K = (log n)^{O(log k)}, our reduction takes O(nnz(A) log n + (n+d) poly(K/eps))
time to reduce the problem to a constrained version involving matrices whose
dimensions are poly(K eps^{-1} log n), and we also give bicriteria solutions;
(4) our techniques lead to the first O(nnz(A) + poly(d/eps)) time algorithms for
(1+eps)-approximate regression for a wide class of convex M-Estimators.
Comment: paper appeared in FOCS, 2015
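As a rough illustration of the regression objective in result (4), min_x sum_i M((Ax - b)_i), here is a standard iteratively reweighted least squares (IRLS) sketch for the Huber loss. It is only meant to make the problem concrete under assumed parameter names; this is a plain dense solver, not the paper's O(nnz(A) + poly(d/eps)) time algorithm.

import numpy as np

def huber_regression_irls(A, b, delta=1.0, iters=50):
    # Minimize sum_i M((Ax - b)_i) for the Huber loss M via IRLS.
    x = np.linalg.lstsq(A, b, rcond=None)[0]      # least-squares warm start
    for _ in range(iters):
        r = A @ x - b
        # Huber IRLS weights: 1 on small residuals, delta/|r| on large ones.
        w = np.where(np.abs(r) <= delta, 1.0, delta / np.maximum(np.abs(r), 1e-12))
        Aw = A * w[:, None]
        x = np.linalg.solve(A.T @ Aw, Aw.T @ b)   # weighted normal equations
    return x

rng = np.random.default_rng(1)
A = rng.standard_normal((200, 5))
b = A @ np.ones(5) + 0.1 * rng.standard_normal(200)
b[:5] += 100.0                                    # a few gross outliers
print(huber_regression_irls(A, b))                # close to the all-ones vector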
Self-improving Algorithms for Coordinate-wise Maxima
Computing the coordinate-wise maxima of a planar point set is a classic and
well-studied problem in computational geometry. We give an algorithm for this
problem in the self-improving setting. We have (unknown) independent
distributions D_1, D_2, ..., D_n of planar points. An input point set is
generated by taking an independent sample from each D_i, so the input
distribution D is the product prod_i D_i. A self-improving algorithm
repeatedly gets input sets from the distribution D (which is a priori
unknown) and tries to optimize its running time for D. Our algorithm uses
the first few inputs to learn salient features of the distribution, and then
becomes an optimal algorithm for distribution D. Let OPT_D denote the
expected depth of an optimal linear comparison tree computing the maxima for
distribution D. Our algorithm eventually has an expected running time of
O(OPT_D + n), even though it did not know D to begin with.
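For reference, the underlying (non-self-improving) problem has a textbook O(n log n) solution: sort the points by x-coordinate and sweep, keeping each point whose y-coordinate exceeds everything seen so far. The sketch below shows only that baseline; it is not the paper's algorithm.

def coordinatewise_maxima(points):
    # Maximal points of a planar set: those not dominated in both
    # coordinates by another point. Sort by decreasing x and sweep.
    maxima = []
    best_y = float("-inf")
    for x, y in sorted(points, key=lambda p: (-p[0], -p[1])):
        if y > best_y:
            maxima.append((x, y))
            best_y = y
    return maxima

print(coordinatewise_maxima([(1, 5), (2, 3), (4, 4), (3, 1), (4, 2)]))
# [(4, 4), (1, 5)] -- the maxima, in order of decreasing x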
Our result requires new tools to understand linear comparison trees for
computing maxima. We show how to convert general linear comparison trees to
very restricted versions, which can then be related to the running time of our
algorithm. An interesting feature of our algorithm is an interleaved search,
where the algorithm tries to determine the likeliest point to be maximal with
minimal computation. This allows the running time to be truly optimal for the
distribution D.
Comment: To appear in Symposium on Computational Geometry 2012 (17 pages, 2
figures)
Self-Improving Algorithms
We investigate ways in which an algorithm can improve its expected
performance by fine-tuning itself automatically with respect to an unknown
input distribution D. We assume here that D is of product type. More precisely,
suppose that we need to process a sequence I_1, I_2, ... of inputs I = (x_1,
x_2, ..., x_n) of some fixed length n, where each x_i is drawn independently
from some arbitrary, unknown distribution D_i. The goal is to design an
algorithm for these inputs so that eventually the expected running time will be
optimal for the input distribution D = D_1 * D_2 * ... * D_n.
We give such self-improving algorithms for two problems: (i) sorting a
sequence of numbers and (ii) computing the Delaunay triangulation of a planar
point set. Both algorithms achieve optimal expected limiting complexity. The
algorithms begin with a training phase during which they collect information
about the input distribution, followed by a stationary regime in which the
algorithms settle to their optimized incarnations.
Comment: 26 pages, 8 figures, preliminary versions appeared at SODA 2006 and
SoCG 2008. Thorough revision to improve the presentation of the paper
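The two-phase structure (training phase, then stationary regime) can be caricatured for sorting as follows: the toy sketch below learns global bucket boundaries from a few sample inputs and then sorts later inputs by bucketing against them. It ignores the per-coordinate search structures and the entropy-optimality analysis of the actual algorithm, and every name in it is illustrative.

import bisect
import random

class SelfImprovingSorterSketch:
    def __init__(self, n_buckets=32):
        self.n_buckets = n_buckets
        self.boundaries = None

    def train(self, sample_inputs):
        # Training phase: estimate bucket boundaries as quantiles of the
        # pooled training inputs drawn from the unknown distribution D.
        pooled = sorted(x for inp in sample_inputs for x in inp)
        step = max(1, len(pooled) // self.n_buckets)
        self.boundaries = pooled[step::step]

    def sort(self, inp):
        # Stationary regime: bucket each element by binary search against
        # the learned boundaries, then sort within each (small) bucket.
        buckets = [[] for _ in range(len(self.boundaries) + 1)]
        for x in inp:
            buckets[bisect.bisect_left(self.boundaries, x)].append(x)
        out = []
        for b in buckets:
            out.extend(sorted(b))
        return out

rng = random.Random(0)
draw_input = lambda: [rng.gauss(i, 5.0) for i in range(200)]   # x_i drawn from its own D_i
sorter = SelfImprovingSorterSketch()
sorter.train([draw_input() for _ in range(20)])
inp = draw_input()
assert sorter.sort(inp) == sorted(inp)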
Capacity Analysis of Vector Symbolic Architectures
Hyperdimensional computing (HDC) is a biologically-inspired framework which
represents symbols with high-dimensional vectors, and uses vector operations to
manipulate them. The ensemble of a particular vector space and a prescribed set
of vector operations (including one addition-like for "bundling" and one
outer-product-like for "binding") forms a *vector symbolic architecture* (VSA).
While VSAs have been employed in numerous applications and have been studied
empirically, many theoretical questions about VSAs remain open. We analyze the
*representation capacities* of four common VSAs: MAP-I, MAP-B, and two VSAs
based on sparse binary vectors. "Representation capacity" here refers to bounds
on the dimensions of the VSA vectors required to perform certain symbolic
tasks, such as testing for set membership and estimating the intersection
size of two sets of symbols, to a given degree of accuracy. We also analyze
the ability of a novel variant of a
Hopfield network (a simple model of associative memory) to perform some of the
same tasks that are typically asked of VSAs. In addition to providing new
bounds on VSA capacities, our analyses establish and leverage connections
between VSAs, "sketching" (dimensionality reduction) algorithms, and Bloom
filters.
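As a rough illustration of the operations whose capacity is analyzed, the following NumPy sketch builds MAP-style random bipolar symbol vectors, bundles them by elementwise addition, and answers set-membership and intersection-size queries with dot products. The dimension and thresholds are illustrative assumptions, and binding is omitted.

import numpy as np

rng = np.random.default_rng(0)

def random_symbol(d):
    # Atomic symbol: a random +/-1 vector of dimension d.
    return rng.choice([-1.0, 1.0], size=d)

def bundle(vectors):
    # Bundling ("addition-like"): elementwise sum of the symbol vectors.
    return np.sum(vectors, axis=0)

def contains(bundle_vec, symbol, threshold):
    # Membership test: the dot product concentrates near d for a bundled
    # symbol and near 0 for an absent one, so threshold at d/2.
    return float(bundle_vec @ symbol) > threshold

d = 10_000
symbols = {name: random_symbol(d) for name in "abcdefgh"}
S = bundle([symbols[s] for s in "abcd"])     # represents the set {a, b, c, d}
T = bundle([symbols[s] for s in "cdef"])     # represents the set {c, d, e, f}

print([s for s in symbols if contains(S, symbols[s], threshold=d / 2)])
# ['a', 'b', 'c', 'd'] with high probability for large enough d
print(round((S @ T) / d))                    # estimated intersection size, here 2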
Sharper Bounds for Regularized Data Fitting
We study matrix sketching methods for regularized variants of linear regression,
low-rank approximation, and canonical correlation analysis. Our main focus is on
sketching techniques which preserve the objective function value for regularized
problems, which is an area that has remained largely unexplored. We study
regularization both in a fairly broad setting, and in the specific context of the
popular and widely used technique of ridge regularization; for the latter, as
applied to each of these problems, we show algorithmic resource bounds in which
the statistical dimension appears in places where in previous bounds the rank
would appear. The statistical dimension is always smaller than the rank, and
decreases as the amount of regularization increases. In particular, we show this
for the ridge low-rank approximation problem as well as regularized low-rank
approximation problems in a much more general setting, where the regularizing
function satisfies some very general conditions (chiefly, invariance under
orthogonal transformations).
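For concreteness, the statistical dimension of A at regularization level lambda is sd_lambda(A) = sum_i sigma_i^2 / (sigma_i^2 + lambda), where the sigma_i are the singular values of A. The sketch below (illustrative names, no sketching involved) simply evaluates that definition and the exact ridge-regression solution, and shows the statistical dimension shrinking as lambda grows.

import numpy as np

def statistical_dimension(A, lam):
    # sd_lambda(A) = sum_i sigma_i^2 / (sigma_i^2 + lambda); it is at most
    # rank(A) and decreases as the regularization lambda increases.
    s = np.linalg.svd(A, compute_uv=False)
    return float(np.sum(s**2 / (s**2 + lam)))

def ridge_regression(A, b, lam):
    # Exact solution of min_x ||Ax - b||^2 + lambda * ||x||^2.
    return np.linalg.solve(A.T @ A + lam * np.eye(A.shape[1]), A.T @ b)

rng = np.random.default_rng(0)
A = rng.standard_normal((500, 50))
print(np.linalg.matrix_rank(A))                 # rank is 50
for lam in (0.1, 10.0, 1000.0):
    print(lam, statistical_dimension(A, lam))   # shrinks as lam grows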