Asynchronous Decentralized Parallel Stochastic Gradient Descent
Most commonly used distributed machine learning systems are either
synchronous or centralized asynchronous. Synchronous algorithms like
AllReduce-SGD perform poorly in a heterogeneous environment, while asynchronous
algorithms using a parameter server suffer from 1) a communication bottleneck at
the parameter server when there are many workers, and 2) significantly worse
convergence when traffic to the parameter server is congested. Can we design an algorithm
that is robust in a heterogeneous environment, while being communication
efficient and maintaining the best-possible convergence rate? In this paper, we
propose an asynchronous decentralized stochastic gradient descent algorithm
(AD-PSGD) satisfying all of the above expectations. Our theoretical analysis shows
that AD-PSGD converges at the same optimal rate as SGD and achieves linear
speedup w.r.t. the number of workers. Empirically, AD-PSGD outperforms the best of
decentralized parallel SGD (D-PSGD), asynchronous parallel SGD (A-PSGD), and
standard data parallel SGD (AllReduce-SGD), often by orders of magnitude in a
heterogeneous environment. When training ResNet-50 on ImageNet with up to 128
GPUs, AD-PSGD converges (w.r.t. epochs) similarly to AllReduce-SGD, but each
epoch can be up to 4-8X faster than its synchronous counterparts in a
network-sharing HPC environment. To the best of our knowledge, AD-PSGD is the
first asynchronous algorithm that achieves a similar epoch-wise convergence
rate to AllReduce-SGD at an over-100-GPU scale.
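To make the update pattern concrete, here is a minimal toy simulation of an AD-PSGD-style step, not the authors' implementation: a randomly activated worker averages its model with a random neighbor and applies a stochastic gradient computed on its possibly stale local copy. The ring topology, quadratic loss, and all constants are illustrative assumptions.

```python
# Illustrative simulation of an AD-PSGD-style update rule on a toy quadratic
# objective; a sketch only, not the paper's implementation.
import numpy as np

rng = np.random.default_rng(0)
n_workers, dim, steps, lr = 8, 5, 2000, 0.05
target = rng.normal(size=dim)                  # minimizer of the toy loss
models = rng.normal(size=(n_workers, dim))     # one model copy per worker

def stochastic_grad(x):
    """Gradient of 0.5*||x - target||^2 plus noise, standing in for a minibatch gradient."""
    return (x - target) + 0.1 * rng.normal(size=x.shape)

for _ in range(steps):
    i = rng.integers(n_workers)                 # a random worker "wakes up" (asynchrony)
    j = (i + rng.choice([-1, 1])) % n_workers   # random neighbor on a ring topology
    grad = stochastic_grad(models[i])           # gradient on the (possibly stale) local model
    avg = 0.5 * (models[i] + models[j])         # pairwise model averaging with the neighbor
    models[i] = avg - lr * grad                 # local SGD step applied after averaging
    models[j] = avg                             # neighbor keeps the averaged model

print("mean distance to optimum:", np.linalg.norm(models - target, axis=1).mean())
```

The point illustrated is that averaging and gradient computation require no global barrier, which is what removes the straggler penalty of synchronous schemes.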
An Asynchronous, Decentralized Solution Framework for the Large Scale Unit Commitment Problem
With increased reliance on cyber infrastructure, large-scale power networks
face new challenges in computational scalability. In this paper we focus
on developing an asynchronous decentralized solution framework for the Unit
Commitment (UC) problem for large-scale power networks. We exploit the inherent
asynchrony of a region-based decomposition, which arises from imbalance among
the regional subproblems, to boost computational efficiency. A two-phase
algorithm is proposed that relies on a convex relaxation and privacy-preserving
valid inequalities to deliver algorithmic improvements. Our algorithm
employs a novel interleaved binary mechanism that locally switches from the
convex subproblem to its binary counterpart based on consistent local
convergence behavior. We develop a high performance computing (HPC) oriented
software framework that uses Message Passing Interface (MPI) to drive our
benchmark studies. Our simulations, performed on the IEEE 3012-bus case, are
benchmarked against a centralized method and a state-of-the-art synchronous
decentralized method. The results demonstrate that the asynchronous method
significantly improves computational efficiency while providing solution
quality that rivals the benchmark methods.
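A rough sketch of the interleaved binary mechanism described above, under strong simplifying assumptions: each region iterates on a scalar relaxed commitment and switches locally to a binary solution once its iterates stabilize. The toy subproblem and the stability test are illustrative; the paper's actual UC subproblems, valid inequalities, and MPI-driven asynchrony are not modeled here.

```python
# Sketch of an interleaved binary switch: regions iterate on a convex
# relaxation and each switches to its binary counterpart after consistent
# local convergence. All names and constants are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(1)
n_regions, tol, patience = 4, 1e-3, 3
cost = rng.uniform(0.2, 0.8, size=n_regions)   # stand-in regional cost signals
x = np.full(n_regions, 0.5)                    # relaxed commitment in [0, 1]
binary_mode = np.zeros(n_regions, dtype=bool)  # which regions have switched
stable = np.zeros(n_regions, dtype=int)

for it in range(50):
    for r in range(n_regions):                 # regions would run asynchronously under MPI
        if binary_mode[r]:
            continue                           # already committed to a binary solution
        new = np.clip(x[r] - 0.3 * (x[r] - cost[r]), 0.0, 1.0)  # relaxed update step
        stable[r] = stable[r] + 1 if abs(new - x[r]) < tol else 0
        x[r] = new
        if stable[r] >= patience:              # consistent local convergence observed:
            x[r] = round(x[r])                 # switch to the binary counterpart
            binary_mode[r] = True
    if binary_mode.all():
        break

print("commitments:", x)
```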
Asynchronous Decentralized Stochastic Optimization in Heterogeneous Networks
We consider expected risk minimization in multi-agent systems composed of
distinct subsets of agents operating without a common time-scale. Each
individual in the network is charged with minimizing the global objective
function, which is the average of the statistical average loss functions of
all agents in the network. Since agents are not assumed to observe data from
identical distributions, the hypothesis that all agents seek a common action,
upon which consensus constraints are formulated, is violated. Thus, we consider
nonlinear network proximity
constraints that incentivize nearby nodes to make decisions that are close to
one another but do not necessarily coincide. Moreover, agents are not assumed to
receive their sequentially arriving observations on a common time index, and
thus seek to learn in an asynchronous manner. An asynchronous stochastic
variant of the Arrow-Hurwicz saddle point method is proposed to solve this
problem, which operates by alternating primal stochastic descent steps with
Lagrange multiplier updates that penalize the discrepancies between agents.
This tool leads to an implementation that allows for each agent to operate
asynchronously with local information only and message passing with neighbors.
Our main result establishes that the proposed method converges in expectation,
in terms of both primal sub-optimality and constraint violation, to radii of
explicitly characterized size. Empirical evaluation on an asynchronously
operating wireless network that manages user channel interference through an
adaptive communications pricing mechanism demonstrates that our theoretical
results translate well to practice.
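The alternating primal-dual structure can be illustrated with a two-agent toy problem. The quadratic losses, step size, and proximity constraint form below are illustrative assumptions, not the paper's exact setup.

```python
# Minimal Arrow-Hurwicz saddle point sketch for two agents coupled by a
# proximity constraint ||x1 - x2||^2 <= gamma; a sketch under assumed
# quadratic losses, not the paper's formulation.
import numpy as np

rng = np.random.default_rng(2)
a1, a2 = np.array([1.0, 0.0]), np.array([0.0, 1.0])  # distinct local data means
x1, x2, lam = np.zeros(2), np.zeros(2), 0.0
alpha, gamma = 0.05, 0.1

for _ in range(3000):
    g1 = x1 - (a1 + 0.1 * rng.normal(size=2))   # stochastic gradient of agent 1's loss
    g2 = x2 - (a2 + 0.1 * rng.normal(size=2))   # stochastic gradient of agent 2's loss
    # Primal descent on the Lagrangian (each agent uses local info plus a neighbor value).
    x1 = x1 - alpha * (g1 + 2.0 * lam * (x1 - x2))
    x2 = x2 - alpha * (g2 + 2.0 * lam * (x2 - x1))
    # Dual ascent penalizing the discrepancy between neighboring agents.
    lam = max(0.0, lam + alpha * (np.sum((x1 - x2) ** 2) - gamma))

print("x1:", x1.round(3), "x2:", x2.round(3),
      "constraint value:", float(np.sum((x1 - x2) ** 2)))
```

Note how the agents' decisions end up close but not identical, which is exactly what the proximity constraint, as opposed to a consensus constraint, permits.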
Decentralized Dynamic Optimization for Power Network Voltage Control
Voltage control in power distribution networks has been greatly challenged by
the increasing penetration of volatile and intermittent devices. These devices
can also provide limited reactive power resources that can be used to regulate
the network-wide voltage. A decentralized voltage control strategy can be
designed by minimizing a quadratic voltage mismatch error objective using
gradient-projection (GP) updates. Through the coupling with the power network
flow, the local voltage provides the instantaneous gradient information. This paper
aims to analyze the performance of this decentralized GP-based voltage control
design under two dynamic scenarios: i) the nodes perform the decentralized
update in an asynchronous fashion, and ii) the network operating condition is
time-varying. For the asynchronous voltage control, we improve the existing
convergence condition by recognizing that the voltage-based gradient is always
up-to-date. By modeling the network dynamics using an autoregressive process
and considering time-varying resource constraints, we provide an error bound in
tracking the instantaneous optimal solution to the quadratic error objective.
This result can be extended to more general \textit{constrained dynamic
optimization} problems with smooth strongly convex objective functions under
stochastic processes that have bounded iterative changes. Extensive numerical
tests have been performed to demonstrate and validate our analytical results
for realistic power networks.
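A minimal sketch of the decentralized GP update under a linearized network model (an assumption for illustration): each node adjusts its reactive power using only its own voltage mismatch and projects onto its resource limits. The sensitivity matrix, limits, and step size below are made up for the example.

```python
# Sketch of decentralized gradient-projection voltage control: the local
# voltage mismatch serves as the gradient, and each update is projected
# onto box limits. The linear model v = v0 + R @ q is an assumption.
import numpy as np

rng = np.random.default_rng(3)
n = 6
B = rng.uniform(0.05, 0.15, size=(n, n))
R = B @ B.T + 0.5 * np.eye(n)           # positive-definite voltage sensitivity matrix
v0 = 1.0 + rng.uniform(-0.05, 0.05, n)  # uncontrolled voltage profile (p.u.)
q = np.zeros(n)                         # reactive power injections
q_min, q_max, alpha = -0.2, 0.2, 0.1

for _ in range(200):
    v = v0 + R @ q                      # "measured" voltages under current injections
    i = rng.integers(n)                 # asynchronous: one random node updates
    q[i] = np.clip(q[i] - alpha * (v[i] - 1.0), q_min, q_max)  # GP step from local voltage

print("final voltage deviations:", (v0 + R @ q - 1.0).round(4))
```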
Scalable Deep Learning on Distributed Infrastructures: Challenges, Techniques and Tools
Deep Learning (DL) has had immense success in the recent past, leading to
state-of-the-art results in various domains such as image recognition and
natural language processing. Among the reasons for this success are the
increasing size of DL models and the availability of vast amounts of training
data. To keep improving the performance of DL, increasing the scalability of
DL systems is necessary. In this survey, we perform a broad and thorough
investigation of challenges, techniques and tools for scalable DL on
distributed infrastructures. This covers infrastructures for DL, methods for
parallel DL training, multi-tenant resource scheduling and the management of
training and model data. Further, we analyze and compare 11 current
open-source DL frameworks and tools and investigate which of the techniques
are commonly implemented in practice. Finally, we highlight future research
trends in DL systems that deserve further research.
Comment: Accepted at ACM Computing Surveys, to appear.
Asynchronous Distributed Optimization with Heterogeneous Regularizations and Normalizations
As multi-agent networks grow in size and scale, they become increasingly
difficult to synchronize, though agents must work together even when generating
and sharing different information at different times. Targeting such cases,
this paper presents an asynchronous optimization framework in which the time
between successive communications and computations is unknown and unspecified
for each agent. Agents' updates are carried out in blocks, with each agent
updating only a small subset of all decision variables. To provide robustness
to asynchrony, each agent uses an independently chosen Tikhonov regularization.
Convergence is measured with respect to a weighted block-maximum norm in which
convergence of agents' blocks can be measured in different p-norms and weighted
differently to heterogeneously normalize problems. Asymptotic convergence is
shown and convergence rates are derived explicitly in terms of a problem's
parameters, with only mild restrictions imposed upon them. Simulation results
are provided to verify the theoretical developments.
Comment: 13 pages, 4 figures, 2 tables. Accepted to the 2018 IEEE CDC.
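A toy sketch of such an asynchronous block scheme on a quadratic objective: each agent updates only its own block, sees possibly stale copies of the other blocks, and adds its own independently chosen Tikhonov regularization. The staleness model, step size, and regularization weights are illustrative assumptions.

```python
# Sketch of asynchronous block updates with heterogeneous Tikhonov
# regularization on a convex quadratic; constants are illustrative.
import numpy as np

rng = np.random.default_rng(4)
n_agents, dim = 4, 2                         # 4 agents, one 2-dim block each
A = rng.normal(size=(n_agents * dim, n_agents * dim))
Q = A @ A.T / (n_agents * dim) + np.eye(n_agents * dim)  # f(x) = 0.5 x^T Q x
tik = rng.uniform(0.01, 0.1, size=n_agents)  # heterogeneous Tikhonov weights alpha_i
x = rng.normal(size=n_agents * dim)
stale = x.copy()                             # last copies communicated to everyone
lr = 0.1

for t in range(2000):
    i = rng.integers(n_agents)               # an agent wakes up at an unspecified time
    s = slice(i * dim, (i + 1) * dim)
    view = stale.copy()
    view[s] = x[s]                           # own block is current; others may be stale
    grad = (Q @ view)[s] + tik[i] * x[s]     # block gradient plus agent i's regularizer
    x[s] -= lr * grad
    if rng.random() < 0.5:                   # communication happens only sometimes
        stale[s] = x[s]

print("residual norm:", np.linalg.norm(x))   # regularized minimizer is x = 0 here
```

The per-agent regularizer is what supplies robustness here: it keeps each block update contractive even when the other blocks in `view` are outdated.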
Asynchronous Decentralized 20 Questions for Adaptive Search
This paper considers the problem of adaptively searching for an unknown
target using multiple agents connected through a time-varying network topology.
Agents are equipped with sensors capable of fast information processing, and we
propose a decentralized collaborative algorithm for controlling their search
given noisy observations. Specifically, we propose decentralized extensions of
the adaptive query-based search strategy that combines elements from the 20
questions approach and social learning. Under standard assumptions on the
time-varying network dynamics, we prove convergence to a correct consensus on the
value of the target parameter as the number of iterations goes to infinity. The
convergence analysis takes a novel approach using martingale-based techniques
combined with spectral graph theory. Our results establish that stability and
consistency can be maintained even with one-way updating and randomized
pairwise averaging, thus providing a scalable, low-complexity method with
performance guarantees. We illustrate the effectiveness of our algorithm for
random network topologies.
Comment: 19 pages, submitted. arXiv admin note: substantial text overlap with arXiv:1312.784
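A small simulation in the spirit of the scheme described above, combining bisection-style noisy queries with randomized pairwise averaging of beliefs. The grid, noise model, and complete-graph pairing are illustrative assumptions, not the paper's setup.

```python
# Sketch of decentralized 20 questions: each agent bisects its own posterior
# with a noisy query, applies a Bayes update, then randomly averages beliefs
# with one peer. All constants are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(5)
grid = np.linspace(0.0, 1.0, 200)
target, eps, n_agents = 0.63, 0.2, 5       # eps = probability an answer is flipped
beliefs = np.full((n_agents, grid.size), 1.0 / grid.size)

for _ in range(60):
    for a in range(n_agents):
        med = grid[np.searchsorted(np.cumsum(beliefs[a]), 0.5)]  # posterior median query
        truth = target < med
        answer = truth if rng.random() > eps else not truth      # noisy binary response
        like = (np.where(grid < med, 1 - eps, eps) if answer
                else np.where(grid < med, eps, 1 - eps))
        beliefs[a] *= like                                       # Bayes update
        beliefs[a] /= beliefs[a].sum()
    i, j = rng.choice(n_agents, size=2, replace=False)           # randomized pairwise averaging
    beliefs[i] = beliefs[j] = 0.5 * (beliefs[i] + beliefs[j])

print("agent estimates:", grid[beliefs.argmax(axis=1)].round(3))
```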
Massively-concurrent Agent-based Evolutionary Computing
The fusion of the multi-agent paradigm with evolutionary computation yielded
promising results in many optimization problems. Evolutionary multi-agent
systems (EMAS) are more similar to biological evolution than classical
evolutionary algorithms. However, technological limitations prevented the use
of fully asynchronous agents in previous EMAS implementations. In this paper we
present a new algorithm for agent-based evolutionary computations. The
individuals are represented as fully autonomous and asynchronous agents. An
efficient implementation of this algorithm was possible through the use of
modern technologies based on functional languages (namely Erlang and Scala),
which natively support lightweight processes and asynchronous communication.
Our experiments show that such an asynchronous approach is both faster and more
efficient in solving common optimization problems.
Comment: Journal of Computational Science, available online 29 July 201
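The paper's implementations use Erlang and Scala lightweight processes. As a rough illustration only, an asyncio-based Python analogue of fully asynchronous EMAS agents, with a made-up energy/fight/reproduction rule, might look like this:

```python
# Rough asyncio analogue of fully asynchronous EMAS agents; the fitness
# function, energy rule, and population size are illustrative assumptions.
import asyncio
import random

random.seed(6)
POP = [{"x": random.uniform(-5, 5), "energy": 10} for _ in range(20)]

def fitness(a):
    return -a["x"] ** 2                                # toy objective, optimum at x = 0

async def agent(me):
    for _ in range(50):
        await asyncio.sleep(random.uniform(0, 0.001))  # agents share no clock or barrier
        other = random.choice(POP)
        if other is me or me["energy"] <= 0:
            continue
        loser, winner = sorted((me, other), key=fitness)
        if loser["energy"] > 0:                        # meeting: energy flows to the fitter
            loser["energy"] -= 1
            winner["energy"] += 1
        if winner["energy"] >= 15:                     # reproduction with Gaussian mutation
            winner["energy"] -= 5
            POP.append({"x": winner["x"] + random.gauss(0, 0.1), "energy": 5})

async def main():
    # Offspring appended during the run stay passive in this sketch.
    await asyncio.gather(*(agent(a) for a in list(POP)))
    best = max(POP, key=fitness)
    print("best x:", round(best["x"], 3), "population size:", len(POP))

asyncio.run(main())
```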
Decentralized Schemes with Overlap for Solving Graph-Structured Optimization Problems
We present a new algorithmic paradigm for the decentralized solution of
graph-structured optimization problems that arise in the estimation and control
of network systems. A key and novel design concept of the proposed approach is
that it uses overlapping subdomains to promote and accelerate convergence. We
show that the algorithm converges if the size of the overlap is sufficiently
large and that the convergence rate improves exponentially with the size of the
overlap. The proposed approach provides a bridge between fully decentralized
and centralized architectures and is flexible in that it enables the
implementation of asynchronous schemes, handling of constraints, and balancing
of computing, communication, and data privacy needs. The proposed scheme is
tested on an estimation problem for a 9241-node power network, and we show that
it outperforms the alternating direction method of multipliers (ADMM).
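The role of overlap can be illustrated on a chain-structured linear system. The restricted-additive-Schwarz-style iteration below is a sketch under simplifying assumptions, not the paper's algorithm: each subdomain solves its problem over an expanded (overlapping) index set, then keeps only its owned entries.

```python
# Sketch of an overlapping decomposition on a 1-D chain system A x = b;
# sizes, overlap width, and the system itself are illustrative assumptions.
import numpy as np

n, n_sub, overlap = 40, 4, 3
A = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)   # graph-structured (chain) system
b = np.ones(n)
x = np.zeros(n)
size = n // n_sub

for it in range(30):
    new_x = x.copy()
    for s in range(n_sub):
        lo = max(0, s * size - overlap)                  # expand subdomain by the overlap
        hi = min(n, (s + 1) * size + overlap)
        idx = np.arange(lo, hi)
        # Fix values outside the subdomain at their current iterates.
        rhs = b[idx] - A[idx] @ x + A[np.ix_(idx, idx)] @ x[idx]
        local = np.linalg.solve(A[np.ix_(idx, idx)], rhs)  # local subdomain solve
        own = np.arange(s * size, (s + 1) * size)
        new_x[own] = local[own - lo]                     # keep only owned entries
    x = new_x

print("residual:", np.linalg.norm(A @ x - b))
```

Widening `overlap` speeds up convergence of this iteration, mirroring the abstract's claim that the rate improves with the size of the overlap.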
Edge Intelligence: Paving the Last Mile of Artificial Intelligence with Edge Computing
With the breakthroughs in deep learning, recent years have witnessed a
boom in artificial intelligence (AI) applications and services, spanning
from personal assistants to recommendation systems to video/audio surveillance.
More recently, with the proliferation of mobile computing and the
Internet-of-Things (IoT), billions of mobile and IoT devices are connected to
the Internet, generating vast amounts of data at the network edge. Driven by
this trend, there is an urgent need to push the AI frontier to the network
edge so as to fully unleash the potential of edge big data. To meet this
demand, edge computing, an emerging paradigm that pushes computing tasks and
services from the network core to the network edge, has been widely recognized
as a promising solution. The resulting new interdiscipline, edge AI or edge
intelligence, is beginning to receive a tremendous amount of interest. However,
research on edge intelligence is still in its infancy, and a dedicated
venue for exchanging the recent advances of edge intelligence is highly desired
by both the computer system and artificial intelligence communities. To this
end, we conduct a comprehensive survey of the recent research efforts on edge
intelligence. Specifically, we first review the background and motivation for
artificial intelligence running at the network edge. We then provide an
overview of the overarching architectures, frameworks and emerging key
technologies for deep learning training and inference at the network edge.
Finally, we discuss future research opportunities on edge intelligence.
We believe that this survey will attract escalating attention, stimulate
fruitful discussions and inspire further research ideas on edge intelligence.
Comment: Zhi Zhou, Xu Chen, En Li, Liekang Zeng, Ke Luo, and Junshan Zhang,
"Edge Intelligence: Paving the Last Mile of Artificial Intelligence with Edge
Computing," Proceedings of the IEEE.