5,240 research outputs found
Asynchronous Distributed Semi-Stochastic Gradient Optimization
With the recent proliferation of large-scale learning problems, there has
been considerable interest in distributed machine learning algorithms,
particularly those based on stochastic gradient descent (SGD) and its variants.
However, existing algorithms either suffer from slow convergence due to the
inherent variance of stochastic gradients, or have a fast linear convergence
rate but at the expense of poorer solution quality. In this paper, we combine
their merits by proposing a fast distributed asynchronous SGD-based algorithm
with variance reduction. A constant learning rate can be used, and the
algorithm is guaranteed to converge linearly to the optimal solution.
Experiments on the Google Cloud Computing Platform demonstrate that the
proposed algorithm outperforms state-of-the-art distributed asynchronous
algorithms in terms of both wall-clock time and solution quality.
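To make the variance-reduction idea concrete, here is a minimal single-machine SVRG-style sketch for least squares. The paper's distributed, asynchronous machinery is omitted, and the function name, step size, and epoch count are illustrative assumptions rather than the authors' settings.

```python
# Single-machine SVRG-style sketch (illustrative; the paper's algorithm
# is distributed and asynchronous). Shows why variance reduction lets a
# constant learning rate converge linearly.
import numpy as np

def svrg(X, y, lr=0.1, epochs=20, seed=0):
    """Minimize the least-squares loss (1/2n)||Xw - y||^2 with SVRG."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(epochs):
        w_snap = w.copy()
        # Full gradient at the snapshot, recomputed once per epoch.
        full_grad = X.T @ (X @ w_snap - y) / n
        for _ in range(n):
            i = rng.integers(n)
            # Per-sample gradients at the current iterate and the snapshot.
            g_i = X[i] * (X[i] @ w - y[i])
            g_snap = X[i] * (X[i] @ w_snap - y[i])
            # Variance-reduced step: unbiased, and its variance shrinks as
            # w approaches w_snap, so the step size need not decay.
            w -= lr * (g_i - g_snap + full_grad)
    return w

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    X = rng.standard_normal((200, 5))
    w_true = rng.standard_normal(5)
    y = X @ w_true
    w_hat = svrg(X, y)
    print(np.linalg.norm(w_hat - w_true))  # should shrink toward zero
```

Because the variance of the correction term (g_i - g_snap + full_grad) vanishes as the iterate approaches the snapshot, the learning rate can stay constant, which is what enables the linear convergence rate claimed in the abstract.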
Second-Order Stochastic Optimization for Machine Learning in Linear Time
First-order stochastic methods are the state of the art in large-scale
machine learning optimization owing to their efficient per-iteration
complexity. Second-order methods, while able to provide faster convergence,
have been much less explored due to the high cost of computing second-order
information.
In this paper we develop second-order stochastic methods for optimization
problems in machine learning that match the per-iteration cost of
gradient-based methods, and in certain settings improve upon the overall
running time of popular first-order methods. Furthermore, our algorithm has
the desirable property of being implementable in time linear in the sparsity
of the input data.
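As a rough illustration of how a second-order step can be taken at first-order cost, the sketch below estimates the Hessian-inverse-gradient product for ridge regression using a truncated Neumann series over sampled per-sample Hessians, in the style of LiSSA-type estimators. The function names, recursion depth, and the `scale` damping factor are illustrative assumptions, not the paper's notation. Each sampled Hessian-vector product touches a single row of X, so its cost is linear in that row's sparsity.

```python
# Hypothetical LiSSA-style sketch for ridge regression; hyperparameters
# (depth, scale) are illustrative. The Hessian H is never formed: each
# Neumann term uses one sampled per-sample Hessian-vector product.
import numpy as np

def lissa_direction(X, y, w, grad, lam=0.1, depth=50, scale=0.05, seed=0):
    """Estimate H^{-1} @ grad via the truncated Neumann series
    H^{-1} = scale * sum_k (I - scale*H)^k, sampling one per-sample
    Hessian per term. Requires scale * ||H_i|| < 1 for stability."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    u = grad.copy()
    for _ in range(depth):
        i = rng.integers(n)
        # Sampled Hessian-vector product (x_i x_i^T + lam*I) @ u in O(d),
        # or O(nnz(x_i)) for sparse rows.
        hv = X[i] * (X[i] @ u) + lam * u
        # Neumann recursion: u <- grad + (I - scale*H_i) @ u.
        u = grad + u - scale * hv
    return scale * u

def stochastic_newton(X, y, lam=0.1, steps=30):
    """Newton-type iterations using the estimated inverse-Hessian step."""
    n, d = X.shape
    w = np.zeros(d)
    for t in range(steps):
        grad = X.T @ (X @ w - y) / n + lam * w  # full ridge gradient
        w -= lissa_direction(X, y, w, grad, lam, seed=t)
    return w
```

The depth parameter trades truncation bias against cost: more Neumann terms give a better estimate of the inverse-Hessian step, while each added term remains a single sparsity-bounded operation, which is how the per-iteration cost stays comparable to a gradient step.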