Avoiding Communication in Logistic Regression
Stochastic gradient descent (SGD) is one of the most widely used optimization
methods for solving various machine learning problems. SGD solves an
optimization problem by iteratively sampling a few data points from the input
data, computing gradients for the selected data points, and updating the
solution. However, in a parallel setting, SGD requires interprocess
communication at every iteration. We introduce a new communication-avoiding
technique for solving the logistic regression problem using SGD. This technique
re-organizes the SGD computations into a form that communicates every s
iterations instead of every iteration, where s is a tuning parameter. We
prove theoretical flops, bandwidth, and latency upper bounds for SGD and its
new communication-avoiding variant. Furthermore, we show experimental results
that illustrate that the new Communication-Avoiding SGD (CA-SGD) method can
achieve significant speedups on a high-performance InfiniBand cluster
without altering the convergence behavior or accuracy.
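To make the communication pattern concrete, the following is a minimal, hypothetical sketch (not the paper's exact derivation) of batching s SGD steps for logistic regression between communication rounds. It assumes mpi4py for the allreduce, a row-partitioned local data shard (X_local, y_local with labels in {-1, +1}), and illustrative names such as ca_sgd_logistic and the parameter s; the paper's CA-SGD instead reorganizes the SGD recurrence so that the updates remain identical to sequential SGD, which this sketch does not attempt to reproduce.

    # Sketch: one allreduce per s iterations instead of one per iteration.
    # Assumptions (not from the paper): mpi4py, dense NumPy shards,
    # plain logistic loss, and a simplified gradient batching scheme.
    import numpy as np
    from mpi4py import MPI

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    def ca_sgd_logistic(X_local, y_local, s=8, eta=0.1, epochs=5, seed=0):
        comm = MPI.COMM_WORLD
        rng = np.random.default_rng(seed + comm.Get_rank())
        n_local, d = X_local.shape
        w = np.zeros(d)

        for _ in range(epochs):
            for _ in range(0, n_local, s):
                # Each process samples the rows for its next s steps and
                # computes the corresponding partial gradients up front.
                idx = rng.integers(0, n_local, size=s)
                G_local = np.zeros((s, d))
                for k, i in enumerate(idx):
                    xi, yi = X_local[i], y_local[i]
                    # Logistic-loss gradient at the current w; this trades
                    # extra local flops for fewer messages, unlike the
                    # paper's recurrence, which preserves sequential SGD.
                    G_local[k] = -yi * xi * sigmoid(-yi * (xi @ w))

                # Communicate once per s iterations.
                G = np.empty_like(G_local)
                comm.Allreduce(G_local, G, op=MPI.SUM)
                G /= comm.Get_size()

                # Apply the s updates locally (identical on every process).
                for k in range(s):
                    w -= eta * G[k]
        return w

The point of the sketch is the latency trade-off named in the abstract: the number of messages drops by roughly a factor of s, at the cost of extra local computation per communication round.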