Input Sparsity and Hardness for Robust Subspace Approximation
In the subspace approximation problem, we seek a k-dimensional subspace F of
R^d that minimizes the sum of p-th powers of Euclidean distances to a given set
of n points a_1, ..., a_n in R^d, for p >= 1. More generally than minimizing
sum_i dist(a_i,F)^p, we may wish to minimize sum_i M(dist(a_i,F)) for some loss
function M(), for example, M-Estimators, which include the Huber and Tukey loss
functions. Such subspaces provide alternatives to the singular value
decomposition (SVD), which is the p=2 case, finding such an F that minimizes
the sum of squares of distances. For p in [1,2), and for typical M-Estimators,
the minimizing F gives a solution that is more robust to outliers than that
provided by the SVD. We give several algorithmic and hardness results for these
robust subspace approximation problems.
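As a purely illustrative sketch (not the paper's algorithm), the Python snippet below evaluates sum_i dist(a_i,F)^p and a Huber M-estimator variant of the objective for a candidate subspace F given by an orthonormal basis; the Huber threshold tau, the synthetic data, and the use of the SVD subspace as the candidate are assumptions of the example.

import numpy as np

def subspace_distances(A, V):
    # Euclidean distance from each row of A (n x d) to the subspace
    # spanned by the columns of V (d x k, assumed orthonormal).
    proj = A @ V @ V.T
    return np.linalg.norm(A - proj, axis=1)

def huber(r, tau=1.0):
    # Huber loss: quadratic for small residuals, linear for large ones.
    small = np.abs(r) <= tau
    return np.where(small, 0.5 * r**2, tau * (np.abs(r) - 0.5 * tau))

def robust_subspace_cost(A, V, p=None, tau=1.0):
    # sum_i dist(a_i,F)^p if p is given, else sum_i Huber(dist(a_i,F)).
    dists = subspace_distances(A, V)
    return np.sum(dists**p) if p is not None else np.sum(huber(dists, tau))

# Example: score the SVD subspace (the p = 2 minimizer) under robust objectives.
rng = np.random.default_rng(0)
A = rng.normal(size=(100, 5))
A[:5] *= 50                                # a few gross outliers
_, _, Vt = np.linalg.svd(A, full_matrices=False)
V = Vt[:2].T                               # top-2 right singular vectors (k = 2)
print(robust_subspace_cost(A, V, p=2))     # sum of squared distances (SVD objective)
print(robust_subspace_cost(A, V, p=1))     # sum of distances, more outlier-robust
print(robust_subspace_cost(A, V))          # Huber M-estimator objective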
We think of the n points as forming an n x d matrix A, and let nnz(A)
denote the number of non-zero entries of A. Our results hold for p in [1,2). We
use poly(n) to denote n^{O(1)} as n -> infty. We obtain: (1) for minimizing
sum_i dist(a_i,F)^p, we give an algorithm running in O(nnz(A) +
(n+d) poly(k/eps) + exp(poly(k/eps))) time; (2) we show that the problem of
minimizing sum_i dist(a_i, F)^p is NP-hard, even to output a
(1+1/poly(d))-approximation, answering a question of Kannan and Vempala, and
complementing prior results which held for p > 2; (3) for loss functions from a
wide class of M-Estimators, we give a problem-size reduction: for a parameter
K=(log n)^{O(log k)}, our reduction takes O(nnz(A) log n + (n+d) poly(K/eps))
time to reduce the problem to a constrained version involving matrices whose
dimensions are poly(K eps^{-1} log n), and we also give bicriteria solutions; (4)
our techniques lead to the first O(nnz(A) + poly(d/eps)) time algorithms for
(1+eps)-approximate regression for a wide class of convex M-Estimators.
Comment: paper appeared in FOCS, 201
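For context on what the regression result (4) refers to, the following sketch minimizes the Huber regression objective sum_i M(y_i - x_i^T beta) by iteratively reweighted least squares. This is a standard baseline for convex M-estimator regression, not the O(nnz(A) + poly(d/eps)) time algorithm of the paper, and the threshold tau and iteration count are illustrative choices.

import numpy as np

def huber_weights(r, tau=1.0):
    # IRLS weights for the Huber loss: 1 for |r| <= tau, tau/|r| otherwise.
    a = np.maximum(np.abs(r), 1e-12)
    return np.where(a <= tau, 1.0, tau / a)

def huber_regression_irls(X, y, tau=1.0, iters=50):
    # Minimize sum_i Huber(y_i - x_i^T beta) via iteratively reweighted
    # least squares: repeatedly solve the weighted normal equations.
    beta = np.linalg.lstsq(X, y, rcond=None)[0]   # least-squares start
    for _ in range(iters):
        w = huber_weights(y - X @ beta, tau)
        Xw = X * w[:, None]                       # rows of X scaled by weights
        beta = np.linalg.solve(X.T @ Xw, Xw.T @ y)
    return beta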
Sharpened Lazy Incremental Quasi-Newton Method
We consider the finite sum minimization of n strongly convex and smooth
functions with Lipschitz continuous Hessians in d dimensions. In many
applications where such problems arise, including maximum likelihood
estimation, empirical risk minimization, and unsupervised learning, the number
of observations n is large, and it becomes necessary to use incremental or
stochastic algorithms whose per-iteration complexity is independent of n. Of
these, the incremental/stochastic variants of the Newton method exhibit
superlinear convergence, but incur a per-iteration complexity of O(d^3),
which may be prohibitive in large-scale settings. On the other hand, the
incremental Quasi-Newton method incurs a per-iteration complexity of
O(d^2), but its superlinear convergence rate has only been characterized
asymptotically. This work puts forth the Sharpened Lazy Incremental
Quasi-Newton (SLIQN) method that achieves the best of both worlds: an explicit
superlinear convergence rate with a per-iteration complexity of O(d^2).
Building upon the recently proposed Sharpened Quasi-Newton method, the proposed
incremental variant employs a hybrid update strategy that combines both
classic and greedy BFGS updates. The proposed lazy update rule distributes the
computational complexity between the iterations, so as to enable a
per-iteration complexity of O(d^2). Numerical tests demonstrate the
superiority of SLIQN over all other incremental and stochastic Quasi-Newton
variants.
Comment: 39 pages, 3 figures
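The hybrid strategy mentioned above combines two known ingredients. As a schematic (not the SLIQN implementation), the sketch below applies a classic BFGS update along the iterate displacement and a greedy BFGS update along the best coordinate direction, following the greedy quasi-Newton literature; the quadratic test problem, the initialization G = L*I, and the coordinate-wise greedy rule are assumptions of the example. Each update costs O(d^2), consistent with the per-iteration complexity claimed in the abstract.

import numpy as np

def bfgs_update(G, u, Au):
    # Rank-2 BFGS update of a Hessian approximation G along direction u.
    # Au is the true Hessian-vector product A @ u (greedy step) or the
    # gradient difference y = grad f(x+) - grad f(x) (classic step).
    Gu = G @ u
    return G - np.outer(Gu, Gu) / (u @ Gu) + np.outer(Au, Au) / (u @ Au)

def greedy_direction(G, A):
    # Pick the coordinate vector e_i maximizing (e_i^T G e_i) / (e_i^T A e_i).
    i = np.argmax(np.diag(G) / np.diag(A))
    u = np.zeros(G.shape[0])
    u[i] = 1.0
    return u

# One classic-then-greedy step on a fixed quadratic with Hessian A.
rng = np.random.default_rng(0)
M = rng.normal(size=(5, 5))
A = M @ M.T + np.eye(5)                      # SPD "true" Hessian
G = np.linalg.eigvalsh(A).max() * np.eye(5)  # initial approximation with G >= A
s = rng.normal(size=5)                       # stand-in for an iterate displacement
G = bfgs_update(G, s, A @ s)                 # classic update along the step
u = greedy_direction(G, A)
G = bfgs_update(G, u, A @ u)                 # greedy update along a coordinate
print(np.linalg.norm(G - A))                 # approximation error after the step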