An Accelerated Decentralized Stochastic Proximal Algorithm for Finite Sums
Modern large-scale finite-sum optimization relies on two key aspects:
distribution and stochastic updates. For smooth and strongly convex problems,
existing decentralized algorithms are slower than modern accelerated
variance-reduced stochastic algorithms when run on a single machine, and are
therefore not efficient. Centralized algorithms are fast, but their scaling is
limited by global aggregation steps that result in communication bottlenecks.
In this work, we propose an efficient \textbf{A}ccelerated
\textbf{D}ecentralized stochastic algorithm for \textbf{F}inite \textbf{S}ums
named ADFS, which uses local stochastic proximal updates and randomized
pairwise communications between nodes. On n machines, ADFS learns from nm
samples in the same time it takes optimal algorithms to learn from m samples
on one machine. This scaling holds until a critical network size is reached,
which depends on communication delays, on the number of samples m, and on the
network topology. We provide a theoretical analysis based on a novel augmented
graph approach combined with a precise evaluation of synchronization times and
an extension of the accelerated proximal coordinate gradient algorithm to
arbitrary sampling. We illustrate the improvement of ADFS over state-of-the-art
decentralized approaches with experiments.
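To make the abstract's computation/communication pattern concrete, here is a minimal toy simulation, not the paper's ADFS algorithm itself, of the two primitives it describes: local stochastic proximal updates at individual nodes and randomized pairwise communications along edges. The ridge-regression data, line topology, step size, and regularization constant are all assumptions made for illustration.

```python
import numpy as np

# Toy sketch (not Algorithm 1 of the ADFS paper): n nodes, each holding m
# local samples of a ridge-regression problem. Each step flips a coin
# between a local stochastic proximal-gradient update at one node and a
# randomized pairwise averaging along one edge.

rng = np.random.default_rng(0)
n, m, d = 4, 50, 10            # nodes, samples per node, dimension (assumed)
step, reg = 0.1, 0.1           # step size and l2 regularization (assumed)

A = rng.normal(size=(n, m, d))              # local features
b = rng.normal(size=(n, m))                 # local targets
x = np.zeros((n, d))                        # one iterate per node
edges = [(i, i + 1) for i in range(n - 1)]  # line topology (assumption)

for _ in range(5000):
    if rng.random() < 0.5:
        # Local computation: stochastic gradient of one sample, then the
        # closed-form prox of the l2 penalty, y / (1 + step * reg).
        i = rng.integers(n)
        k = rng.integers(m)
        g = (A[i, k] @ x[i] - b[i, k]) * A[i, k]
        x[i] = (x[i] - step * g) / (1.0 + step * reg)
    else:
        # Communication: one randomly activated edge averages its endpoints,
        # so no global aggregation step is ever required.
        i, j = edges[rng.integers(len(edges))]
        avg = 0.5 * (x[i] + x[j])
        x[i] = x[j] = avg

print("disagreement across nodes:", np.linalg.norm(x - x.mean(axis=0)))
```

Running the sketch shows the node iterates contracting toward a common point, which is the qualitative behavior the pairwise-communication scheme is designed to achieve without any communication bottleneck at a central server.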
Dual-Free Stochastic Decentralized Optimization with Variance Reduction
We consider the problem of training machine learning models on distributed
data in a decentralized way. For finite-sum problems, fast single-machine
algorithms for large datasets rely on stochastic updates combined with variance
reduction. Yet, existing decentralized stochastic algorithms either do not
obtain the full speedup allowed by stochastic updates, or require oracles that
are more expensive than regular gradients. In this work, we introduce a
Decentralized stochastic algorithm with Variance Reduction called DVR. DVR only
requires computing stochastic gradients of the local functions, and is
computationally as fast as a standard stochastic variance-reduced algorithm
run on a fraction 1/n of the dataset, where n is the number of nodes. To
derive DVR, we use Bregman coordinate descent on a well-chosen dual problem,
and obtain a dual-free algorithm using a specific Bregman divergence. We give
an accelerated version of DVR based on the Catalyst framework, and illustrate
its effectiveness with simulations on real data.
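As background for the variance-reduction mechanism the abstract builds on, below is a minimal single-machine SVRG-style sketch. This is a standard variance-reduced method, not DVR itself, which is derived through Bregman coordinate descent on a dual problem; it only illustrates the kind of cheap oracle the abstract refers to, where nothing more expensive than stochastic gradients of individual terms is ever computed. The least-squares problem and all constants are assumptions for illustration.

```python
import numpy as np

# Minimal SVRG-style sketch (a standard variance-reduced method, not DVR):
# the update direction is an unbiased gradient estimate whose variance
# shrinks as the iterate and the snapshot approach the optimum.

rng = np.random.default_rng(1)
N, d = 200, 10                 # samples and dimension (assumed)
A = rng.normal(size=(N, d))
b = rng.normal(size=N)
step = 0.05                    # step size (assumed)

def grad_k(x, k):
    """Gradient of the k-th least-squares term f_k(x) = 0.5 * (a_k^T x - b_k)^2."""
    return (A[k] @ x - b[k]) * A[k]

x = np.zeros(d)
for epoch in range(30):
    x_ref = x.copy()                           # snapshot point
    full_grad = A.T @ (A @ x_ref - b) / N      # full gradient at the snapshot
    for _ in range(N):
        k = rng.integers(N)
        # Variance-reduced direction: stochastic gradient at x, corrected by
        # the same sample's gradient at the snapshot plus the full gradient.
        v = grad_k(x, k) - grad_k(x_ref, k) + full_grad
        x -= step * v

print("final objective:", 0.5 * np.mean((A @ x - b) ** 2))
```

The correction term is what removes the need for decaying step sizes: its expectation over k equals the full gradient, while its variance vanishes at the optimum, which is the single-machine speedup that decentralized methods like DVR aim to preserve on each node's local data.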