12,065 research outputs found
Asynchronous Parallel Stochastic Gradient Descent - A Numeric Core for Scalable Distributed Machine Learning Algorithms
The implementation of a vast majority of machine learning (ML) algorithms
boils down to solving a numerical optimization problem. In this context,
Stochastic Gradient Descent (SGD) methods have long proven to provide good
results, both in terms of convergence and accuracy. Recently, several
parallelization approaches have been proposed in order to scale SGD to solve
very large ML problems. At their core, most of these approaches are following a
map-reduce scheme. This paper presents a novel parallel updating algorithm for
SGD, which utilizes the asynchronous single-sided communication paradigm.
Compared to existing methods, Asynchronous Parallel Stochastic Gradient Descent
(ASGD) provides faster (or at least equal) convergence, close to linear scaling
and stable accuracy
Message-Passing Algorithms for Quadratic Minimization
Gaussian belief propagation (GaBP) is an iterative algorithm for computing
the mean of a multivariate Gaussian distribution, or equivalently, the minimum
of a multivariate positive definite quadratic function. Sufficient conditions,
such as walk-summability, that guarantee the convergence and correctness of
GaBP are known, but GaBP may fail to converge to the correct solution given an
arbitrary positive definite quadratic function. As was observed in previous
work, the GaBP algorithm fails to converge if the computation trees produced by
the algorithm are not positive definite. In this work, we will show that the
failure modes of the GaBP algorithm can be understood via graph covers, and we
prove that a parameterized generalization of the min-sum algorithm can be used
to ensure that the computation trees remain positive definite whenever the
input matrix is positive definite. We demonstrate that the resulting algorithm
is closely related to other iterative schemes for quadratic minimization such
as the Gauss-Seidel and Jacobi algorithms. Finally, we observe, empirically,
that there always exists a choice of parameters such that the above
generalization of the GaBP algorithm converges
- …