127 research outputs found

    Input Sparsity and Hardness for Robust Subspace Approximation

    In the subspace approximation problem, we seek a k-dimensional subspace F of R^d that minimizes the sum of p-th powers of Euclidean distances to a given set of n points a_1, ..., a_n in R^d, for p >= 1. More generally than minimizing sum_i dist(a_i,F)^p, we may wish to minimize sum_i M(dist(a_i,F)) for some loss function M(), for example, M-Estimators, which include the Huber and Tukey loss functions. Such subspaces provide alternatives to the singular value decomposition (SVD), which is the p=2 case, finding such an F that minimizes the sum of squares of distances. For p in [1,2), and for typical M-Estimators, the minimizing F gives a solution that is more robust to outliers than that provided by the SVD. We give several algorithmic and hardness results for these robust subspace approximation problems. We think of the n points as forming an n x d matrix A, and let nnz(A) denote the number of non-zero entries of A. Our results hold for p in [1,2). We use poly(n) to denote n^{O(1)} as n -> infinity. We obtain: (1) for minimizing sum_i dist(a_i,F)^p, we give an algorithm running in O(nnz(A) + (n+d)poly(k/eps) + exp(poly(k/eps))) time; (2) we show that the problem of minimizing sum_i dist(a_i,F)^p is NP-hard, even to output a (1+1/poly(d))-approximation, answering a question of Kannan and Vempala, and complementing prior results which held for p > 2; (3) for loss functions for a wide class of M-Estimators, we give a problem-size reduction: for a parameter K=(log n)^{O(log k)}, our reduction takes O(nnz(A) log n + (n+d) poly(K/eps)) time to reduce the problem to a constrained version involving matrices whose dimensions are poly(K eps^{-1} log n). We also give bicriteria solutions; (4) our techniques lead to the first O(nnz(A) + poly(d/eps)) time algorithms for (1+eps)-approximate regression for a wide class of convex M-Estimators. Comment: paper appeared in FOCS, 201
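
    The objective in this abstract can be made concrete with a small sketch. It evaluates sum_i dist(a_i,F)^p for a subspace F given by an orthonormal basis, and uses the SVD to obtain the p=2 minimizer. The function names (subspace_cost, svd_subspace) and the synthetic data are illustrative assumptions; this is not the paper's input-sparsity-time algorithm, only a direct evaluation of the cost it approximates.

```python
import numpy as np

def subspace_cost(A, V, p=2.0):
    """Sum of p-th powers of Euclidean distances from the rows of A
    to the subspace spanned by the orthonormal columns of V (d x k)."""
    residual = A - (A @ V) @ V.T          # component of each row orthogonal to span(V)
    dists = np.linalg.norm(residual, axis=1)
    return float(np.sum(dists ** p))

def svd_subspace(A, k):
    """Top-k right singular vectors: the minimizer for the p=2 case."""
    _, _, Vt = np.linalg.svd(A, full_matrices=False)
    return Vt[:k].T

# Hypothetical data: 50 points in R^5, with one row scaled up as an outlier.
rng = np.random.default_rng(0)
A = rng.standard_normal((50, 5))
A[0] *= 100.0

V = svd_subspace(A, 2)
cost_p2 = subspace_cost(A, V, p=2.0)  # sum of squared distances
cost_p1 = subspace_cost(A, V, p=1.0)  # same subspace scored under p=1
```

    Scoring the same subspace under p=1 and p=2 shows how much weight each objective gives to large distances; finding the actual minimizer for p in [1,2) is the hard problem the abstract addresses.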

    Self-improving Algorithms for Coordinate-wise Maxima

    Computing the coordinate-wise maxima of a planar point set is a classic and well-studied problem in computational geometry. We give an algorithm for this problem in the self-improving setting. We have n (unknown) independent distributions D_1, D_2, ..., D_n of planar points. An input point set (p_1, p_2, ..., p_n) is generated by taking an independent sample p_i from each D_i, so the input distribution D is the product prod_i D_i. A self-improving algorithm repeatedly gets input sets from the distribution D (which is a priori unknown) and tries to optimize its running time for D. Our algorithm uses the first few inputs to learn salient features of the distribution, and then becomes an optimal algorithm for distribution D. Let OPT_D denote the expected depth of an optimal linear comparison tree computing the maxima for distribution D. Our algorithm eventually has an expected running time of O(OPT_D + n), even though it did not know D to begin with. Our result requires new tools to understand linear comparison trees for computing maxima. We show how to convert general linear comparison trees to very restricted versions, which can then be related to the running time of our algorithm. An interesting feature of our algorithm is an interleaved search, where the algorithm tries to determine the likeliest point to be maximal with minimal computation. This allows the running time to be truly optimal for the distribution D. Comment: To appear in Symposium of Computational Geometry 2012 (17 pages, 2 figures)
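
    For orientation, the underlying problem can be stated in a few lines of code. The sketch below is the textbook sort-and-sweep baseline running in O(n log n) time, not the self-improving algorithm described in the abstract; the function name maxima is an illustrative choice.

```python
def maxima(points):
    """Return the coordinate-wise maximal points of a planar point set.

    A point is maximal if no other point is at least as large in both
    coordinates. Points are scanned in decreasing x order; a point is kept
    only if its y exceeds every y seen so far.
    """
    result = []
    best_y = float("-inf")
    for x, y in sorted(points, key=lambda p: (-p[0], -p[1])):
        if y > best_y:
            result.append((x, y))
            best_y = y
    return result

staircase = maxima([(1, 5), (2, 3), (3, 4), (4, 1), (2, 2)])
# staircase == [(4, 1), (3, 4), (1, 5)]
```

    The self-improving setting asks for something stronger than this worst-case bound: an expected running time matched to the unknown input distribution.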

    Self-Improving Algorithms

    We investigate ways in which an algorithm can improve its expected performance by fine-tuning itself automatically with respect to an unknown input distribution D. We assume here that D is of product type. More precisely, suppose that we need to process a sequence I_1, I_2, ... of inputs I = (x_1, x_2, ..., x_n) of some fixed length n, where each x_i is drawn independently from some arbitrary, unknown distribution D_i. The goal is to design an algorithm for these inputs so that eventually the expected running time will be optimal for the input distribution D = D_1 * D_2 * ... * D_n. We give such self-improving algorithms for two problems: (i) sorting a sequence of numbers and (ii) computing the Delaunay triangulation of a planar point set. Both algorithms achieve optimal expected limiting complexity. The algorithms begin with a training phase during which they collect information about the input distribution, followed by a stationary regime in which the algorithms settle to their optimized incarnations. Comment: 26 pages, 8 figures, preliminary versions appeared at SODA 2006 and SoCG 2008. Thorough revision to improve the presentation of the paper.
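
    The two-phase structure (training, then stationary regime) can be illustrated with a heavily simplified sketch. It learns bucket boundaries from pooled training inputs and then sorts new inputs by bucketing. The class name BucketSorter and the quantile-based boundaries are assumptions made for illustration; the sketch omits the distribution-specific search structures of the paper and does not achieve its optimality guarantee.

```python
import bisect

class BucketSorter:
    """Simplified two-phase sorter: learn boundaries once, then reuse them."""

    def __init__(self, training_inputs):
        # Training phase: pool all observed values and keep roughly n evenly
        # spaced order statistics as bucket boundaries (n = input length).
        pooled = sorted(v for inp in training_inputs for v in inp)
        n = len(training_inputs[0])
        step = max(1, len(pooled) // n)
        self.boundaries = pooled[step::step]

    def sort(self, values):
        # Stationary regime: place each value in its bucket, sort each bucket,
        # and concatenate the buckets in order.
        buckets = [[] for _ in range(len(self.boundaries) + 1)]
        for v in values:
            buckets[bisect.bisect_right(self.boundaries, v)].append(v)
        out = []
        for b in buckets:
            b.sort()
            out.extend(b)
        return out
```

    The output is correct for any input; the learned boundaries only affect how evenly the work is spread across buckets when inputs resemble the training data.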

    Capacity Analysis of Vector Symbolic Architectures

    Hyperdimensional computing (HDC) is a biologically inspired framework which represents symbols with high-dimensional vectors, and uses vector operations to manipulate them. The ensemble of a particular vector space and a prescribed set of vector operations (including one addition-like for "bundling" and one outer-product-like for "binding") form a *vector symbolic architecture* (VSA). While VSAs have been employed in numerous applications and have been studied empirically, many theoretical questions about VSAs remain open. We analyze the *representation capacities* of four common VSAs: MAP-I, MAP-B, and two VSAs based on sparse binary vectors. "Representation capacity" here refers to bounds on the dimensions of the VSA vectors required to perform certain symbolic tasks, such as testing for set membership i ∈ S and estimating set intersection sizes |X ∩ Y| for two sets of symbols X and Y, to a given degree of accuracy. We also analyze the ability of a novel variant of a Hopfield network (a simple model of associative memory) to perform some of the same tasks that are typically asked of VSAs. In addition to providing new bounds on VSA capacities, our analyses establish and leverage connections between VSAs, "sketching" (dimensionality reduction) algorithms, and Bloom filters.
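
    The set membership task mentioned above can be sketched as follows, assuming a MAP-style encoding in which each symbol is a random vector with entries in {-1, +1} and bundling is an elementwise integer sum. The dimension, threshold, and function names are illustrative assumptions and are not taken from the paper.

```python
import numpy as np

def random_symbols(num_symbols, dim, rng):
    """One random +/-1 vector per symbol (rows)."""
    return rng.choice(np.array([-1, 1]), size=(num_symbols, dim))

def bundle(vectors):
    """Bundle a set of symbol vectors by elementwise summation."""
    return vectors.sum(axis=0)

def is_member(symbol, bundled, threshold=0.5):
    """Normalized dot product is near 1 for members and near 0 otherwise."""
    return float(symbol @ bundled) / symbol.shape[0] > threshold

rng = np.random.default_rng(0)
symbols = random_symbols(20, 4096, rng)
bundled = bundle(symbols[:10])      # the set S holds symbols 0..9
```

    How large the dimension must be for a given set size and error probability is the kind of capacity question the abstract refers to; the value 4096 here is simply chosen large enough that the test is reliable for a set of ten symbols.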

    Sharper Bounds for Regularized Data Fitting

    We study matrix sketching methods for regularized variants of linear regression, low-rank approximation, and canonical correlation analysis. Our main focus is on sketching techniques which preserve the objective function value for regularized problems, which is an area that has remained largely unexplored. We study regularization both in a fairly broad setting, and in the specific context of the popular and widely used technique of ridge regularization; for the latter, as applied to each of these problems, we show algorithmic resource bounds in which the statistical dimension appears in places where in previous bounds the rank would appear. The statistical dimension is always smaller than the rank, and decreases as the amount of regularization increases. In particular we show this for the ridge low-rank approximation problem as well as regularized low-rank approximation problems in a much more general setting, where the regularizing function satisfies some very general conditions (chiefly, invariance under orthogonal transformations).
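
    The behavior of the statistical dimension can be checked numerically. The sketch below assumes the commonly used definition for ridge regularization with parameter lam, namely the sum over nonzero singular values s_i of A of 1 / (1 + lam / s_i^2); the abstract itself does not spell out the formula, so treat this definition and the function name as assumptions.

```python
import numpy as np

def statistical_dimension(A, lam):
    """Sum over nonzero singular values s of 1 / (1 + lam / s^2).

    Written as s^2 / (s^2 + lam) so that lam = 0 returns the rank.
    """
    s = np.linalg.svd(A, compute_uv=False)
    s = s[s > 1e-12 * s.max()]          # drop numerically zero singular values
    return float(np.sum(s**2 / (s**2 + lam)))

# Hypothetical example: a rank-3 matrix with singular values 3, 2, 1.
A = np.diag([3.0, 2.0, 1.0, 0.0])
sd_none = statistical_dimension(A, 0.0)    # equals the rank
sd_mild = statistical_dimension(A, 1.0)
sd_heavy = statistical_dimension(A, 10.0)
```

    Under this definition the value equals the rank when lam is zero and shrinks as lam grows, which matches the qualitative statement in the abstract.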