Stochastic Optimization of PCA with Capped MSG

Authors: Raman Arora, Andrew Cotter, Nathan Srebro
Publication date: 05/07/2013

We study PCA as a stochastic optimization problem and propose a novel stochastic approximation algorithm, which we refer to as "Matrix Stochastic Gradient" (MSG), as well as a practical variant, Capped MSG. We study the method both theoretically and empirically.

arXiv.org e-Print Archive

CiteSeerX

Stochastic Subgradient Algorithms for Strongly Convex Optimization over Distributed Networks

Authors: Muhammed O. Sayin, N. Denizcan Vanli, Suleyman S. Kozat
Publication date: 31/08/2015

We study diffusion- and consensus-based optimization of a sum of unknown convex objective functions over distributed networks. The only access to these functions is through stochastic gradient oracles, each of which is available only at a different node, and a limited number of gradient oracle calls is allowed at each node. In this framework, we introduce a convex optimization algorithm based on stochastic gradient descent (SGD) updates. In particular, we use a carefully designed time-dependent weighted averaging of the SGD iterates, which yields a convergence rate of $O\left(\frac{N\sqrt{N}}{T}\right)$ after $T$ gradient updates for each node on a network of $N$ nodes. We then show that after $T$ gradient oracle calls, the average SGD iterate achieves a mean square deviation (MSD) of $O\left(\frac{\sqrt{N}}{T}\right)$. This rate of convergence is optimal, as it matches the performance lower bound up to constant terms. Similar to the SGD algorithm, the computational complexity of the proposed algorithm scales linearly with the dimensionality of the data. Furthermore, the communication load of the proposed method is the same as that of the SGD algorithm. Thus, the proposed algorithm is highly efficient in terms of complexity and communication load. We illustrate the merits of the algorithm with respect to state-of-the-art methods over benchmark real-life data sets and widely studied network topologies.
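The idea of time-dependent weighted averaging of SGD iterates can be illustrated with a minimal single-node sketch. It is not the paper's distributed algorithm: the weighting (weight proportional to the iteration index t), the step size 2/(mu(t+1)), the toy quadratic objective, and all names below (`weighted_average_sgd`, `noisy_quadratic_grad`, `TARGET`) are assumptions chosen for illustration, and no diffusion or consensus step between nodes is modeled.

```python
import random

TARGET = [1.0, -2.0]  # hypothetical minimizer of the toy objective


def noisy_quadratic_grad(x, rng, mu=1.0):
    """Stochastic gradient of 0.5 * mu * ||x - TARGET||^2 plus unit Gaussian noise."""
    return [mu * (xi - ti) + rng.gauss(0.0, 1.0) for xi, ti in zip(x, TARGET)]


def weighted_average_sgd(grad_oracle, x0, mu, T, seed=0):
    """SGD whose output is an average of iterates with weight proportional to t.

    Assumed scheme (not taken from the abstract): step size 2 / (mu * (t + 1))
    and weights w_t = t, maintained as a running average.
    """
    rng = random.Random(seed)
    x = list(x0)
    avg = list(x0)
    for t in range(1, T + 1):
        g = grad_oracle(x, rng)
        step = 2.0 / (mu * (t + 1))
        x = [xi - step * gi for xi, gi in zip(x, g)]
        # With rho = 2 / (t + 1), avg equals sum_s s * x_s / (t * (t + 1) / 2).
        rho = 2.0 / (t + 1)
        avg = [(1.0 - rho) * ai + rho * xi for ai, xi in zip(avg, x)]
    return avg


if __name__ == "__main__":
    print(weighted_average_sgd(noisy_quadratic_grad, [5.0, -3.0], mu=1.0, T=5000))
```

Later iterates receive larger weights, so the early, noisier iterates contribute less to the returned average than they would under uniform averaging.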

arXiv.org e-Print Archive

CiteSeerX

Block-Coordinate Frank-Wolfe Optimization for Structural SVMs

Authors: Martin Jaggi, Simon Lacoste-Julien, Patrick Pletscher, Mark Schmidt
Publication date: 01/01/2013

We propose a randomized block-coordinate variant of the classic Frank-Wolfe algorithm for convex optimization with block-separable constraints. Despite its lower iteration cost, we show that it achieves a similar convergence rate in duality gap as the full Frank-Wolfe algorithm. We also show that, when applied to the dual structural support vector machine (SVM) objective, this yields an online algorithm that has the same low iteration complexity as primal stochastic subgradient methods. However, unlike stochastic subgradient methods, the block-coordinate Frank-Wolfe algorithm allows us to compute the optimal step-size and yields a computable duality gap guarantee. Our experiments indicate that this simple algorithm outperforms competing structural SVM solvers.

Comment: Appears in Proceedings of the 30th International Conference on Machine Learning (ICML 2013). 9 pages main text + 22 pages appendix. Changes from v3 to v4: 1) Re-organized appendix; improved & clarified duality gap proofs; re-drew all plots; 2) Changed convention for Cf definition; 3) Added weighted averaging experiments + convergence results; 4) Clarified main text and relationship with appendi
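The block-coordinate Frank-Wolfe idea described in the abstract can be sketched on a toy problem. This is not the structural SVM dual: the objective 0.5 * ||sum_i x_i - b||^2 with each block x_i constrained to a probability simplex, and every name below (`bcfw_simplex_blocks`, `b`, `n_blocks`), are assumptions made for illustration. The sketch shows the three ingredients the abstract mentions: one randomly chosen block per iteration, a closed-form optimal step size, and a computable duality gap.

```python
import random


def bcfw_simplex_blocks(b, n_blocks, iters, seed=0):
    """Block-coordinate Frank-Wolfe on min 0.5 * ||sum_i x_i - b||^2,
    where every block x_i lies on the probability simplex (toy problem)."""
    rng = random.Random(seed)
    d = len(b)
    # Start every block at the first simplex vertex.
    x = [[1.0] + [0.0] * (d - 1) for _ in range(n_blocks)]
    # Residual r = sum_i x_i - b; it is also the gradient w.r.t. every block.
    r = [sum(blk[j] for blk in x) - b[j] for j in range(d)]
    for _ in range(iters):
        i = rng.randrange(n_blocks)  # pick one block uniformly at random
        # Linear minimization oracle over the simplex: best vertex e_j.
        j = min(range(d), key=lambda k: r[k])
        direction = [(1.0 if k == j else 0.0) - x[i][k] for k in range(d)]
        block_gap = -sum(rk * dk for rk, dk in zip(r, direction))
        norm_sq = sum(dk * dk for dk in direction)
        if norm_sq == 0.0:
            continue  # block already sits at the chosen vertex
        # Exact line search for a quadratic, clipped to [0, 1].
        gamma = min(1.0, max(0.0, block_gap / norm_sq))
        x[i] = [xk + gamma * dk for xk, dk in zip(x[i], direction)]
        r = [rk + gamma * dk for rk, dk in zip(r, direction)]
    objective = 0.5 * sum(rk * rk for rk in r)
    # Full duality gap = sum of per-block gaps <x_i - s_i, gradient>.
    r_min = min(r)
    gap = sum(sum(rk * xk for rk, xk in zip(r, blk)) - r_min for blk in x)
    return x, objective, gap


if __name__ == "__main__":
    _, obj, gap = bcfw_simplex_blocks([1.5, 0.5, 0.5, 0.5], n_blocks=3, iters=5000)
    print(obj, gap)
```

Because the objective is convex, the returned gap upper-bounds the suboptimality of the final iterate, so it can serve as a stopping criterion without knowing the optimal value.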

arXiv.org e-Print Archive

CiteSeerX

INRIA a CCSD electronic archive server

HAL-Polytechnique