2 research outputs found

    Weighted SGD for $\ell_p$ Regression with Randomized Preconditioning

    In recent years, stochastic gradient descent (SGD) methods and randomized linear algebra (RLA) algorithms have been applied to many large-scale problems in machine learning and data analysis. We aim to bridge the gap between these two methods in solving constrained overdetermined linear regression problems---e.g., $\ell_2$ and $\ell_1$ regression problems. We propose a hybrid algorithm named pwSGD that uses RLA techniques for preconditioning and constructing an importance sampling distribution, and then performs an SGD-like iterative process with weighted sampling on the preconditioned system. We prove that pwSGD inherits faster convergence rates that depend only on the lower dimension of the linear system, while maintaining low computational complexity. In particular, when solving $\ell_1$ regression of size $n$ by $d$, pwSGD returns an approximate solution with $\epsilon$ relative error in the objective value in $\mathcal{O}(\log n \cdot \text{nnz}(A) + \text{poly}(d)/\epsilon^2)$ time. This complexity is uniformly better than that of RLA methods in terms of both $\epsilon$ and $d$ when the problem is unconstrained. For $\ell_2$ regression, pwSGD returns an approximate solution with $\epsilon$ relative error in the objective value and in the solution vector measured in prediction norm in $\mathcal{O}(\log n \cdot \text{nnz}(A) + \text{poly}(d)\log(1/\epsilon)/\epsilon)$ time. We also provide lower bounds on the coreset complexity for more general regression problems, indicating that new ideas will still be needed to extend similar RLA preconditioning ideas to weighted SGD algorithms for more general regression problems. Finally, the effectiveness of such algorithms is illustrated numerically on both synthetic and real datasets. Comment: A conference version of this paper appears under the same title in Proceedings of the ACM-SIAM Symposium on Discrete Algorithms, Arlington, VA, 201
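    To make the precondition-then-sample recipe concrete, below is a minimal Python sketch of the pwSGD idea for unconstrained $\ell_2$ regression. It is illustrative only, not the paper's exact algorithm: the function name pwsgd_l2 and all parameters are ours, a dense Gaussian sketch stands in for an nnz-time embedding, and the step size is a generic $1/\sqrt{t}$ schedule.

```python
import numpy as np

def pwsgd_l2(A, b, n_iters=5000, sketch_rows=None, step0=1.0, seed=0):
    # Illustrative sketch of the pwSGD recipe for unconstrained l2 regression.
    # NOT the paper's exact algorithm: a dense Gaussian sketch replaces the
    # nnz-time embedding, and the step-size schedule is a generic 1/sqrt(t).
    rng = np.random.default_rng(seed)
    n, d = A.shape
    s = sketch_rows or 4 * d

    # 1) RLA preconditioning: QR of the small sketch S @ A yields R such that
    #    A @ inv(R) is nearly orthonormal (well-conditioned).
    S = rng.standard_normal((s, n)) / np.sqrt(s)
    _, R = np.linalg.qr(S @ A)
    AR = A @ np.linalg.inv(R)                     # preconditioned system, n x d

    # 2) Importance sampling: squared row norms of A @ inv(R) approximate
    #    the leverage scores of A.
    probs = np.einsum('ij,ij->i', AR, AR)
    probs /= probs.sum()

    # 3) Weighted SGD on min_y ||AR y - b||^2, sampling rows by probs and
    #    reweighting each gradient by 1/probs[i] so the estimate is unbiased.
    y = np.zeros(d)
    for t in range(1, n_iters + 1):
        i = rng.choice(n, p=probs)
        grad = ((AR[i] @ y - b[i]) / probs[i]) * AR[i]
        y -= (step0 / np.sqrt(t)) * grad
    return np.linalg.solve(R, y)                  # undo preconditioning: x = inv(R) y
```

    Because $AR^{-1}$ is well-conditioned, the SGD phase's convergence depends on $d$ rather than on the conditioning of the original $A$, which is the intuition behind the $\text{poly}(d)$ terms in the bounds above.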

    Aligning Points to Lines: Provable Approximations

    We suggest a new optimization technique for minimizing the sum $\sum_{i=1}^n f_i(x)$ of $n$ non-convex real functions that satisfy a property we call piecewise log-Lipschitz. We do this by forging links between techniques in computational geometry, combinatorics, and convex optimization. As an example application, we provide the first constant-factor approximation algorithms whose running time is polynomial in $n$ for the fundamental problem of \emph{Points-to-Lines alignment}: given $n$ points $p_1,\cdots,p_n$ and $n$ lines $\ell_1,\cdots,\ell_n$ in the plane and $z>0$, compute the matching $\pi:[n]\to[n]$ and alignment (rotation matrix $R$ and translation vector $t$) that minimize the sum of Euclidean distances $\sum_{i=1}^n \mathrm{dist}(Rp_i-t,\ell_{\pi(i)})^z$ between each point and its corresponding line. This problem is non-trivial even if $z=1$ and the matching $\pi$ is given. If $\pi$ is given, the running time of our algorithms is $O(n^3)$, and even near-linear in $n$ using core-sets that support streaming, dynamic, and distributed parallel computations in poly-logarithmic update time. Generalizations for handling, e.g., outliers or pseudo-distances such as $M$-estimators are also provided. Experimental results and open-source code show that our provable algorithms improve on existing heuristics in practice as well. A companion demonstration video in the context of Augmented Reality shows how such algorithms may be used in real-time systems.
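    For intuition about the objective being minimized, the short Python sketch below (all names are our own; the line representation as an anchor point plus unit direction is an assumption for illustration) just evaluates $\sum_{i=1}^n \mathrm{dist}(Rp_i-t,\ell_{\pi(i)})^z$ for one candidate alignment. The paper's actual contribution, the provable constant-factor search over alignments, is not reproduced here.

```python
import numpy as np

def alignment_cost(points, lines, theta, t, pi, z=1.0):
    # Objective of Points-to-Lines alignment for one candidate solution.
    # points: (n, 2) array of p_i; lines: list of (q_j, u_j) with q_j a point
    # on line l_j and u_j its UNIT direction; pi[i] = line matched to point i.
    c, s = np.cos(theta), np.sin(theta)
    R = np.array([[c, -s], [s, c]])               # rotation matrix from angle theta
    total = 0.0
    for i, p in enumerate(points):
        q, u = lines[pi[i]]
        v = R @ p - t - q                         # moved point relative to line anchor
        dist = abs(v[0] * u[1] - v[1] * u[0])     # perpendicular distance via 2D cross product
        total += dist ** z
    return total
```

    Evaluating this cost for a fixed candidate is easy; the hardness lies in minimizing it jointly over the rotation, translation, and matching, which is what the constant-factor approximation algorithms address.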