
7,037 research outputs found

Randomized dual proximal gradient for large-scale distributed optimization

Authors: Notarnicola Ivano; Notarstefano Giuseppe
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/01/2015

In this paper we consider distributed optimization problems in which the cost function is separable (i.e., a sum of possibly non-smooth functions all sharing a common variable) and can be split into a strongly convex term and a convex one. The second term is typically used to encode constraints or to regularize the solution. We propose an asynchronous, distributed optimization algorithm over an undirected topology, based on a proximal gradient update on the dual problem. We show that, by means of a proper choice of primal variables, the dual problem is separable and the dual variables can be stacked into separate blocks. This allows us to show that a distributed gossip update can be obtained by means of a randomized block-coordinate proximal gradient on the dual function.
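To make the phrase "randomized block-coordinate proximal gradient" concrete, here is a minimal sketch in Python. It is not the paper's dual gossip algorithm: it works on a small centralized primal problem (a strongly convex quadratic plus an l1 term) with blocks of size one, and the names `soft_threshold` and `randomized_coordinate_prox_gradient`, the matrix `A`, the vector `b`, and the weight `lam` are all assumptions chosen for illustration.

```python
import random


def soft_threshold(v, t):
    """Proximal operator of t * |.|: shrink v toward zero by t."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0


def randomized_coordinate_prox_gradient(A, b, lam, iters=500, seed=0):
    """Minimize 0.5*x'Ax - b'x + lam*||x||_1, one random coordinate per step.

    A is assumed symmetric with positive diagonal (illustrative setup).
    """
    rng = random.Random(seed)
    n = len(b)
    x = [0.0] * n
    for _ in range(iters):
        i = rng.randrange(n)  # pick one coordinate (block) at random
        # Partial gradient of the smooth term with respect to x[i].
        grad_i = sum(A[i][j] * x[j] for j in range(n)) - b[i]
        step = 1.0 / A[i][i]  # inverse of the coordinate Lipschitz constant
        # Gradient step on the smooth term, then prox of the l1 term.
        x[i] = soft_threshold(x[i] - step * grad_i, step * lam)
    return x


A = [[2.0, 1.0], [1.0, 2.0]]
b = [3.0, -1.0]
x_star = randomized_coordinate_prox_gradient(A, b, lam=0.5)
```

For this toy instance the optimality conditions can be solved by hand and give x = (11/6, -7/6), which the iterates approach. In the distributed setting described in the abstract, each block would instead correspond to dual variables held by a node or an edge.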

Archivio istituzionale della ricerca - Alma Mater Studiorum Università di Bologna

An Accelerated Proximal Coordinate Gradient Method and its Application to Regularized Empirical Risk Minimization

Authors: Lin Qihang; Lu Zhaosong; Xiao Lin
Publication date: 01/01/2014

We consider the problem of minimizing the sum of two convex functions: one is smooth and given by a gradient oracle, and the other is separable over blocks of coordinates and has a simple known structure over each block. We develop an accelerated randomized proximal coordinate gradient (APCG) method for minimizing such convex composite functions. For strongly convex functions, our method achieves faster linear convergence rates than existing randomized proximal coordinate gradient methods. Without strong convexity, our method enjoys accelerated sublinear convergence rates. We show how to apply the APCG method to solve the regularized empirical risk minimization (ERM) problem, and devise efficient implementations that avoid full-dimensional vector operations. For ill-conditioned ERM problems, our method obtains better convergence rates than the state-of-the-art stochastic dual coordinate ascent (SDCA) method.
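In symbols, the composite problem in the abstract above and one step of the plain (non-accelerated) randomized proximal coordinate gradient method, the baseline that APCG accelerates, can be sketched as follows. The notation ($f$, $\Psi_i$, $L_i$, $x_i^{+}$) is assumed here for illustration and is not taken from the listing.

```latex
\min_{x}\; F(x) = f(x) + \sum_{i=1}^{n} \Psi_i(x_i),
\qquad
x_i^{+} = \arg\min_{u} \Big\{ \langle \nabla_i f(x),\, u - x_i \rangle
          + \frac{L_i}{2}\,\lVert u - x_i \rVert^2 + \Psi_i(u) \Big\}
```

Here $i$ is a block index drawn at random at each iteration, $L_i$ is assumed to be a Lipschitz constant of the partial gradient $\nabla_i f$, and all other blocks are left unchanged.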

arXiv.org e-Print Archive

CiteSeerX

A randomized primal distributed algorithm for partitioned and big-data non-convex optimization

Authors: Notarnicola Ivano; Notarstefano Giuseppe
Publication date: 24/03/2017

In this paper we consider a distributed optimization scenario in which the aggregate objective function to minimize is partitioned, big-data and possibly non-convex. Specifically, we focus on a set-up in which the dimension of the decision variable depends on the network size as well as the number of local functions, but each local function handled by a node depends only on a (small) portion of the entire optimization variable. This problem set-up has been shown to appear in many interesting network application scenarios. As the main contribution of the paper, we develop a simple, primal distributed algorithm to solve the optimization problem, based on a randomized descent approach, which works under asynchronous gossip communication. We prove that the proposed asynchronous algorithm is a proper, ad-hoc version of a coordinate descent method and thus converges to a stationary point. To show the effectiveness of the proposed algorithm, we also present numerical simulations on a non-convex quadratic program, which confirm the theoretical results.
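The idea of randomized coordinate descent on a non-convex quadratic program can be illustrated with a small centralized sketch. This is not the paper's partitioned, gossip-based algorithm: the box constraint, the matrix `Q`, the vector `c`, and the function names are all assumptions made for the example. Each step moves one randomly chosen coordinate along its negative partial gradient with step 1/|Q_ii| and clips it to the box, which cannot increase the objective.

```python
import random


def objective(Q, c, x):
    """Evaluate 0.5*x'Qx + c'x for a symmetric matrix Q."""
    n = len(x)
    quad = sum(Q[i][j] * x[i] * x[j] for i in range(n) for j in range(n))
    return 0.5 * quad + sum(c[i] * x[i] for i in range(n))


def randomized_coordinate_descent(Q, c, lo=-1.0, hi=1.0, iters=300, seed=0):
    """Projected coordinate gradient steps on a box-constrained quadratic.

    Q may be indefinite, so the problem is non-convex in general.
    """
    rng = random.Random(seed)
    n = len(c)
    x = [0.0] * n
    history = [objective(Q, c, x)]
    for _ in range(iters):
        i = rng.randrange(n)  # coordinate that "wakes up"
        grad_i = sum(Q[i][j] * x[j] for j in range(n)) + c[i]
        L_i = max(abs(Q[i][i]), 1e-12)  # coordinate-wise curvature bound
        # Gradient step on coordinate i, then projection onto [lo, hi].
        x[i] = min(hi, max(lo, x[i] - grad_i / L_i))
        history.append(objective(Q, c, x))
    return x, history


# Hypothetical indefinite (hence non-convex) instance.
Q = [[1.0, 2.0, 0.0], [2.0, -1.0, 0.5], [0.0, 0.5, -2.0]]
c = [0.5, -0.3, 0.2]
x_final, history = randomized_coordinate_descent(Q, c)
```

The recorded `history` is non-increasing and `x_final` stays inside the box. In the partitioned setting of the abstract, each node would own some coordinates and would need only its neighbors' values to evaluate its partial gradient.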

arXiv.org e-Print Archive

Crossref

Archivio Istituzionale della Ricerca - Università del Salento

A Coordinate Descent Primal-Dual Algorithm and Application to Distributed Asynchronous Optimization

Authors: Bianchi Pascal; Hachem Walid; Iutzeler Franck
Publication date: 30/09/2015

Based on the idea of randomized coordinate descent of $\alpha$-averaged operators, a randomized primal-dual optimization algorithm is introduced, where a random subset of coordinates is updated at each iteration. The algorithm builds upon a variant of a recent (deterministic) algorithm proposed by Vũ and Condat that includes the well-known ADMM as a particular case. The obtained algorithm is used to solve asynchronously a distributed optimization problem. A network of agents, each having a separate cost function containing a differentiable term, seeks to find a consensus on the minimum of the aggregate objective. The method yields an algorithm where, at each iteration, a random subset of agents wake up, update their local estimates, exchange some data with their neighbors, and go idle. Numerical results demonstrate the attractive performance of the method. The general approach can be naturally adapted to other situations where coordinate descent convex optimization algorithms are used with a random choice of the coordinates. Comment: 10 page

arXiv.org e-Print Archive

HAL-CentraleSupelec

Crossref

Hal - Université Grenoble Alpes

INRIA a CCSD electronic archive server

HAL-Rennes 1

An Accelerated Decentralized Stochastic Proximal Algorithm for Finite Sums

Authors: Bach Francis; Hendrikx Hadrien; Massoulie Laurent
Publication date: 12/06/2019

Modern large-scale finite-sum optimization relies on two key aspects: distribution and stochastic updates. For smooth and strongly convex problems, existing decentralized algorithms are slower than modern accelerated variance-reduced stochastic algorithms when run on a single machine, and are therefore not efficient. Centralized algorithms are fast, but their scaling is limited by global aggregation steps that result in communication bottlenecks. In this work, we propose an efficient Accelerated Decentralized stochastic algorithm for Finite Sums named ADFS, which uses local stochastic proximal updates and randomized pairwise communications between nodes. On $n$ machines, ADFS learns from $nm$ samples in the same time it takes optimal algorithms to learn from $m$ samples on one machine. This scaling holds until a critical network size is reached, which depends on communication delays, on the number of samples $m$, and on the network topology. We provide a theoretical analysis based on a novel augmented graph approach combined with a precise evaluation of synchronization times and an extension of the accelerated proximal coordinate gradient algorithm to arbitrary sampling. We illustrate the improvement of ADFS over state-of-the-art decentralized approaches with experiments. Comment: Code available in source files. arXiv admin note: substantial text overlap with arXiv:1901.0986
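The "randomized pairwise communications between nodes" mentioned in the abstract can be pictured with plain gossip averaging. This sketch is not ADFS: there are no local proximal updates and no acceleration. It shows only the communication pattern, in which one randomly chosen edge is activated at a time and its two endpoints average their values. The function name `pairwise_gossip`, the ring topology, and the initial values are assumptions made for the example.

```python
import random


def pairwise_gossip(values, edges, iters=2000, seed=0):
    """Randomized pairwise gossip: one random edge averages per iteration."""
    rng = random.Random(seed)
    x = list(values)
    for _ in range(iters):
        i, j = rng.choice(edges)  # activate one communication link
        avg = 0.5 * (x[i] + x[j])  # the two endpoints exchange values
        x[i] = x[j] = avg  # and both keep the average
    return x


values = [4.0, 0.0, 2.0, 10.0]  # hypothetical local values, mean 4.0
edges = [(0, 1), (1, 2), (2, 3), (3, 0)]  # ring of four nodes
result = pairwise_gossip(values, edges)
```

Because each exchange preserves the sum of the two values involved, the network average is invariant and, on a connected graph, every node's value approaches it without any global aggregation step.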

arXiv.org e-Print Archive

INRIA a CCSD electronic archive server