Coordinate Descent with Bandit Sampling

Authors: Celis L. Elisa; Salehi Farnood; Thiran Patrick
Publication date: 04/12/2018

Coordinate descent methods usually minimize a cost function by updating a random decision variable (corresponding to one coordinate) at a time. Ideally, we would update the decision variable that yields the largest decrease in the cost function. However, finding this coordinate would require checking all of them, which would effectively negate the improvement in computational tractability that coordinate descent is intended to afford. To address this, we propose a new adaptive method for selecting a coordinate. First, we find a lower bound on the amount the cost function decreases when a coordinate is updated. We then use a multi-armed bandit algorithm to learn which coordinates result in the largest lower bound by interleaving this learning with conventional coordinate descent updates except that the coordinate is selected proportionately to the expected decrease. We show that our approach improves the convergence of coordinate descent methods both theoretically and experimentally. Comment: appearing at NeurIPS 201
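The idea in this abstract, treating coordinates as bandit arms and the observed cost decrease as the reward, can be illustrated with a minimal sketch. It is not the authors' algorithm. It assumes a least-squares cost 0.5*||Ax - b||^2, uses the exact per-coordinate decrease in place of the paper's lower bound, and substitutes a simple epsilon-greedy rule for the proportional sampling the abstract describes. The function name and the parameters `steps`, `eps` and `seed` are hypothetical.

```python
import random

def bandit_coordinate_descent(A, b, steps=2000, eps=0.2, seed=0):
    """Minimize 0.5*||Ax - b||^2 with exact single-coordinate updates.

    Coordinates are the bandit arms. The reward of an arm is the cost
    decrease observed the last time that coordinate was updated.
    Assumes every column of A is nonzero.
    """
    rng = random.Random(seed)
    m, d = len(A), len(A[0])
    x = [0.0] * d
    r = [-bi for bi in b]  # residual Ax - b at x = 0
    col_sq = [sum(A[i][j] ** 2 for i in range(m)) for j in range(d)]
    # Optimistic initialization: every coordinate is tried at least once.
    reward = [float("inf")] * d
    for _ in range(steps):
        if rng.random() < eps:
            j = rng.randrange(d)                          # explore
        else:
            j = max(range(d), key=lambda k: reward[k])    # exploit
        g = sum(A[i][j] * r[i] for i in range(m))  # partial derivative
        delta = -g / col_sq[j]                     # exact minimizer along j
        x[j] += delta
        for i in range(m):
            r[i] += delta * A[i][j]
        # For this quadratic the decrease equals g^2 / (2 * ||A[:, j]||^2).
        reward[j] = g * g / (2.0 * col_sq[j])
    return x, 0.5 * sum(ri * ri for ri in r)
```

Calling `bandit_coordinate_descent([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]], [1.0, 2.0, 3.0])` returns the iterate together with the final cost. The stored rewards go stale between visits, which is why the exploration probability `eps` is kept strictly positive.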

arXiv.org e-Print Archive

Infoscience - École polytechnique fédérale de Lausanne

Supervised Learning Under Distributed Features

Authors: Sayed Ali H.; Ying Bicheng; Yuan Kun
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 25/01/2019

This work studies the problem of learning under both large datasets and large-dimensional feature space scenarios. The feature information is assumed to be spread across agents in a network, where each agent observes some of the features. Through local cooperation, the agents are supposed to interact with each other to solve an inference problem and converge towards the global minimizer of an empirical risk. We study this problem exclusively in the primal domain, and propose new and effective distributed solutions with guaranteed convergence to the minimizer with linear rate under strong convexity. This is achieved by combining a dynamic diffusion construction, a pipeline strategy, and variance-reduced techniques. Simulation results illustrate the conclusions.

arXiv.org e-Print Archive

Infoscience - École polytechnique fédérale de Lausanne

Dual-Free Stochastic Decentralized Optimization with Variance Reduction

Authors: Bach Francis; Hendrikx Hadrien; Massoulié Laurent
Publication date: 25/06/2020

We consider the problem of training machine learning models on distributed data in a decentralized way. For finite-sum problems, fast single-machine algorithms for large datasets rely on stochastic updates combined with variance reduction. Yet, existing decentralized stochastic algorithms either do not obtain the full speedup allowed by stochastic updates, or require oracles that are more expensive than regular gradients. In this work, we introduce a Decentralized stochastic algorithm with Variance Reduction called DVR. DVR only requires computing stochastic gradients of the local functions, and is computationally as fast as a standard stochastic variance-reduced algorithm run on a 1/n fraction of the dataset, where n is the number of nodes. To derive DVR, we use Bregman coordinate descent on a well-chosen dual problem, and obtain a dual-free algorithm using a specific Bregman divergence. We give an accelerated version of DVR based on the Catalyst framework, and illustrate its effectiveness with simulations on real data.
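As background for the phrase "stochastic updates combined with variance reduction", the following is a minimal single-machine sketch of SVRG, one standard variance-reduced method. It is not DVR and involves no decentralization. It assumes a least-squares finite sum (1/n) * sum_i 0.5*(a_i . w - b_i)^2. The function name, the step size `lr`, the inner-loop length of 2n and the other defaults are illustrative assumptions.

```python
import random

def svrg_least_squares(A, b, epochs=200, lr=0.1, seed=0):
    """SVRG on (1/n) * sum_i 0.5 * (a_i . w - b_i)^2 (illustrative only)."""
    rng = random.Random(seed)
    n, d = len(A), len(A[0])

    def grad_i(w, i):
        # Gradient of the i-th term of the finite sum at w.
        err = sum(A[i][k] * w[k] for k in range(d)) - b[i]
        return [err * A[i][k] for k in range(d)]

    w = [0.0] * d
    for _ in range(epochs):
        snap = w[:]                      # snapshot point for this epoch
        full = [0.0] * d                 # full gradient at the snapshot
        for i in range(n):
            gi = grad_i(snap, i)
            for k in range(d):
                full[k] += gi[k] / n
        for _ in range(2 * n):           # inner stochastic steps
            i = rng.randrange(n)
            gw, gs = grad_i(w, i), grad_i(snap, i)
            for k in range(d):
                # Variance-reduced estimate: unbiased for the full gradient,
                # with variance that shrinks as w and snap approach the optimum.
                w[k] -= lr * (gw[k] - gs[k] + full[k])
    return w
```

Each inner step evaluates only two single-sample gradients, so the per-step cost does not depend on n; the full gradient is recomputed once per epoch.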

arXiv.org e-Print Archive

INRIA a CCSD electronic archive server

Distributed Dual Coordinate Ascent with Imbalanced Data on a General Tree Network

Authors: Cho Myung; Lai Lifeng; Xu Weiyu
Publication date: 28/08/2023

In this paper, we investigate the impact of imbalanced data on the convergence of distributed dual coordinate ascent in a tree network for solving an empirical loss minimization problem in distributed machine learning. To address this issue, we propose a method called delayed generalized distributed dual coordinate ascent that takes into account the information of the imbalanced data, and provide the analysis of the proposed algorithm. Numerical experiments confirm the effectiveness of our proposed method in improving the convergence speed of distributed dual coordinate ascent in a tree network. Comment: To be published in IEEE 2023 Workshop on Machine Learning for Signal Processing (MLSP)

arXiv.org e-Print Archive

Communication-Efficient Distributed Dual Coordinate Ascent

Authors: Hofmann Thomas; Jaggi Martin; Jordan Michael I.; Krishnan Sanjay; Smith Virginia; Takác Martin; Terhorst Jonathan
Publication date: 21/06/2017

Infoscience - École polytechnique fédérale de Lausanne