Variance-Reduced Stochastic Learning by Networked Agents under Random Reshuffling

Authors: Liu Jiageng, Sayed Ali H., Ying Bicheng, Yuan Kun
Publication date: 29/05/2018

The amortized variance-reduced gradient (AVRG) algorithm developed in \cite{ying2017convergence} has a constant storage requirement, in contrast to SAGA, and balanced gradient computations, in contrast to SVRG. One key advantage of the AVRG strategy is its amenability to decentralized implementations. In this work, we show how AVRG can be extended to the network case, where multiple learning agents are connected by a graph topology. In this scenario, each agent observes data that is spatially distributed, and agents may communicate only with their direct neighbors. Moreover, the amount of data observed by the individual agents may differ drastically. In such situations, the balanced gradient computation property of AVRG becomes a real advantage: it reduces the idle time caused by unbalanced local data storage requirements, which is characteristic of other variance-reduced gradient algorithms. The resulting diffusion-AVRG algorithm is shown to converge linearly to the exact solution and to be much more memory efficient than alternative algorithms. In addition, we propose a mini-batch strategy to balance the communication and computation efficiency of diffusion-AVRG. In simulations, when a proper batch size is employed, diffusion-AVRG is more computationally efficient than exact diffusion or EXTRA while maintaining almost the same communication efficiency. Comment: 23 pages, 12 figures, submitted for publication
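The networked setting described in this abstract can be illustrated with a minimal sketch of a plain adapt-then-combine diffusion iteration. This is not diffusion-AVRG itself, whose update rules are not given in the abstract: it uses full local gradients with no variance reduction, and the three-agent line graph, the combination matrix `A`, the scalar least-squares costs, the data sizes, and the step size `mu` are all assumptions made for illustration. The data are noiseless so that every agent shares the same minimizer.

```python
import random

rng = random.Random(0)
w_true = 3.0  # hypothetical common minimizer shared by all agents

# Three agents with drastically different amounts of local data (assumed sizes).
sizes = [20, 50, 500]
data = []
for m in sizes:
    xs = [rng.gauss(0.0, 1.0) for _ in range(m)]
    data.append([(x, w_true * x) for x in xs])  # noiseless scalar regression

# Assumed doubly stochastic combination matrix for the line graph 0 - 1 - 2:
# an agent only weights itself and its direct neighbors.
A = [[2 / 3, 1 / 3, 0.0],
     [1 / 3, 1 / 3, 1 / 3],
     [0.0, 1 / 3, 2 / 3]]


def local_gradient(w, samples):
    """Full gradient of the local cost (1 / 2m) * sum_i (x_i * w - y_i)^2."""
    return sum(x * (x * w - y) for x, y in samples) / len(samples)


def diffusion_step(w, mu):
    """One adapt-then-combine step over all agents."""
    n = len(w)
    # Adapt: each agent descends along its own local gradient.
    psi = [w[k] - mu * local_gradient(w[k], data[k]) for k in range(n)]
    # Combine: each agent averages the intermediate iterates of its neighbors.
    return [sum(A[k][l] * psi[l] for l in range(n)) for k in range(n)]


w = [0.0, 0.0, 0.0]
for _ in range(1000):
    w = diffusion_step(w, mu=0.1)
print(w)
```

In this sketch the agent holding 500 samples does 25 times the gradient work of the agent holding 20 in every step, so the smaller agents would sit idle while waiting for it. That is the kind of imbalance the abstract refers to.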

arXiv.org e-Print Archive

Infoscience - École polytechnique fédérale de Lausanne

D$^2$: Decentralized Training over Decentralized Data

Authors: Lian Xiangru, Liu Ji, Tang Hanlin, Yan Ming, Zhang Ce
Publication date: 01/01/2018

While training a machine learning model using multiple workers, each of which collects data from its own data sources, it would be most useful when the data collected from different workers can be {\em unique} and {\em different}. Ironically, recent analysis of decentralized parallel stochastic gradient descent (D-PSGD) relies on the assumption that the data hosted on different workers are {\em not too different}. In this paper, we ask the question: {\em Can we design a decentralized parallel stochastic gradient descent algorithm that is less sensitive to the data variance across workers?} We present D$^2$, a novel decentralized parallel stochastic gradient descent algorithm designed for large data variance among workers (imprecisely, "decentralized" data). The core of D$^2$ is a variance reduction extension of the standard D-PSGD algorithm, which improves the convergence rate from $O\left({\sigma \over \sqrt{nT}} + {(n\zeta^2)^{\frac{1}{3}} \over T^{2/3}}\right)$ to $O\left({\sigma \over \sqrt{nT}}\right)$, where $\zeta^{2}$ denotes the variance among data on different workers. As a result, D$^2$ is robust to data variance among workers. We empirically evaluated D$^2$ on image classification tasks where each worker has access to only the data of a limited set of labels, and found that D$^2$ significantly outperforms D-PSGD.
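To make the two convergence rates quoted in this abstract concrete, the following sketch evaluates both expressions with the hidden constants of the $O(\cdot)$ notation dropped. The function names and the values chosen for $\sigma$, $n$, $T$, and $\zeta^2$ are hypothetical and serve only as an illustration; the numbers are not results from the paper.

```python
import math


def dpsgd_rate(sigma, zeta_sq, n, T):
    """Rate expression quoted for D-PSGD, with O(.) constants dropped."""
    return sigma / math.sqrt(n * T) + (n * zeta_sq) ** (1 / 3) / T ** (2 / 3)


def d2_rate(sigma, n, T):
    """Rate expression quoted for D^2, with O(.) constants dropped."""
    return sigma / math.sqrt(n * T)


# Hypothetical settings: 16 workers, 10,000 iterations, unit gradient noise.
sigma, n, T = 1.0, 16, 10_000
for zeta_sq in (0.0, 1.0, 100.0):
    print(zeta_sq, dpsgd_rate(sigma, zeta_sq, n, T), d2_rate(sigma, n, T))
```

When $\zeta^2 = 0$ the two expressions coincide. As $\zeta^2$ grows, only the D-PSGD expression increases, which is the sense in which the abstract calls D$^2$ robust to data variance among workers.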

arXiv.org e-Print Archive

Repository for Publications and Research Data

Distributed Deblurring of Large Images of Wide Field-Of-View

Authors: Bianchi Pascal, Ferrari André, Flamary Rémi, Mourya Rahul, Richard Cédric
Publication date: 15/05/2017

Image deblurring is an economical way to reduce certain degradations (blur and noise) in acquired images. It has thus become an essential tool in high-resolution imaging in many applications, e.g., astronomy, microscopy, and computational photography. In applications such as astronomy and satellite imaging, acquired images can be extremely large (up to gigapixels), covering a wide field of view and suffering from shift-variant blur. Most existing image deblurring techniques are designed and implemented to work efficiently on a centralized computing system with multiple processors and a shared memory. The largest image that can be handled is therefore limited by the size of the physical memory available on the system. In this paper, we propose a distributed nonblind image deblurring algorithm in which several connected processing nodes (with reasonable computational resources) simultaneously process different portions of a large image while maintaining a certain coherency among them, to finally obtain a single crisp image. Unlike the existing centralized techniques, image deblurring in a distributed fashion raises several issues. To tackle these issues, we consider certain approximations that trade off the quality of the deblurred image against the computational resources required to achieve it. The experimental results show that our algorithm produces images of similar quality to the existing centralized techniques while allowing distribution, and is thus cost effective for extremely large images. Comment: 16 pages, 10 figures, submitted to IEEE Trans. on Image Processing

arXiv.org e-Print Archive

HAL-UNICE

HAL-INSU

HAL Descartes