Asynchronous Optimization Over Heterogeneous Networks via Consensus ADMM
This paper considers the distributed optimization of a sum of locally
observable, non-convex functions. The optimization is performed over a
multi-agent networked system, and each local function depends only on a subset
of the variables. An asynchronous, distributed alternating direction method of multipliers (ADMM) algorithm that allows the nodes to defer or skip the computation and transmission of updates is proposed in the paper. The proposed
algorithm utilizes different approximations in the update step, resulting in
proximal and majorized ADMM variants. Both variants are shown to converge to a
local minimum, under certain regularity conditions. The proposed asynchronous
algorithms are also applied to the problem of cooperative localization in
wireless ad hoc networks, where they are shown to outperform other state-of-the-art localization algorithms.
Comment: Submitted to IEEE Transactions on Signal and Information Processing over Networks
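For orientation, the classical synchronous consensus-ADMM iteration that such asynchronous variants relax can be written as follows (standard notation, not taken from the paper; the proximal and majorized variants replace the exact $x$-minimization with cheaper surrogates):

$$x_i^{k+1} = \arg\min_{x_i}\; f_i(x_i) + \frac{\rho}{2}\left\| x_i - z^k + u_i^k \right\|^2, \qquad z^{k+1} = \frac{1}{N}\sum_{i=1}^{N}\left(x_i^{k+1} + u_i^k\right), \qquad u_i^{k+1} = u_i^k + x_i^{k+1} - z^{k+1},$$

where $f_i$ is agent $i$'s local function, $z$ the consensus variable, $u_i$ the scaled dual variable, and $\rho > 0$ the penalty parameter.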
Asynchronous ADMM for Distributed Non-Convex Optimization in Power Systems
Large scale, non-convex optimization problems arising in many complex
networks such as the power system call for efficient and scalable distributed
optimization algorithms. Existing distributed methods are usually iterative and
require synchronization of all workers at each iteration, which is hard to
scale and could result in the under-utilization of computation resources due to
the heterogeneity of the subproblems. To address those limitations of
synchronous schemes, this paper proposes an asynchronous distributed
optimization method based on the Alternating Direction Method of Multipliers
(ADMM) for non-convex optimization. The proposed method only requires local
communications and allows each worker to perform local updates with information
from a subset of, but not all, neighbors. We provide sufficient conditions on the problem formulation, the choice of algorithm parameters, and the network delay, and show that under these mild conditions the proposed asynchronous ADMM method asymptotically converges to a KKT point of the non-convex problem. We validate the effectiveness of asynchronous ADMM by applying it to the Optimal Power Flow problem on multiple power systems and show that the convergence of the proposed asynchronous scheme can be faster than that of its synchronous counterpart in large-scale applications.
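As a minimal sketch of the partial-update idea (function names and message layout are our own assumptions, not the paper's): a worker folds in whichever neighbor messages have arrived, performs dual updates against those possibly stale values, and re-solves its local subproblem without waiting for the remaining neighbors.

```python
def async_admm_local_step(x, duals, inbox, solve_local, rho):
    """One schematic asynchronous ADMM round at a single worker.

    `inbox` maps a *subset* of neighbor ids to their latest consensus
    copies; neighbors that have not reported are simply skipped.
    """
    for j, z_j in inbox.items():
        duals[j] = duals[j] + rho * (x - z_j)  # dual ascent on x = z_j
    x = solve_local(x, duals, rho)             # local primal minimization
    return x, duals
```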
Unwrapping ADMM: Efficient Distributed Computing via Transpose Reduction
Recent approaches to distributed model fitting rely heavily on consensus
ADMM, where each node solves small sub-problems using only local data. We
propose iterative methods that solve {\em global} sub-problems over an entire
distributed dataset. This is possible using transpose reduction strategies that
allow a single node to solve least-squares over massive datasets without
putting all the data in one place. This results in simple iterative methods
that avoid the expensive inner loops required for consensus methods. To
demonstrate the efficiency of this approach, we fit linear classifiers and sparse linear models to datasets over 5 TB in size, using a distributed implementation with over 7000 cores, in far less time than previous approaches.
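The enabling observation is that for least-squares the d x d Gram matrix A^T A stays small however many rows A has, so each node can reduce its shard to A_i^T A_i and A_i^T b_i and ship only those. A single-process sketch of the reduction (our illustration of the idea, not the authors' code):

```python
import numpy as np

def transpose_reduction_lstsq(shards):
    """Solve min_x ||Ax - b||^2 when the rows of (A, b) are distributed:
    aggregate the small Gram matrices instead of moving the raw data."""
    d = shards[0][0].shape[1]
    gram = np.zeros((d, d))            # sum of A_i^T A_i, only d x d
    rhs = np.zeros(d)                  # sum of A_i^T b_i
    for A_i, b_i in shards:            # in practice, one term per node
        gram += A_i.T @ A_i
        rhs += A_i.T @ b_i
    return np.linalg.solve(gram, rhs)  # one node solves the d x d system
```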
Impact of Communication Delay on Asynchronous Distributed Optimal Power Flow Using ADMM
Distributed optimization has attracted considerable attention in power system operation in recent years: a large area is decomposed into smaller control regions, each solving a local optimization problem with periodic information exchange with neighboring regions. However, most distributed
optimization methods are iterative and require synchronization of all regions
at each iteration, which is hard to achieve without a centralized coordinator
and might lead to under-utilization of computation resources due to the
heterogeneity of the regions. To address such limitations of synchronous
schemes, this paper investigates the applicability of asynchronous distributed
optimization methods to power system optimization. Particularly, we focus on
solving the AC Optimal Power Flow problem and propose an algorithmic framework
based on the Alternating Direction Method of Multipliers (ADMM) method that
allows the regions to perform local updates with information received from a
subset of, but not all, neighbors. Through experimental studies, we demonstrate that the convergence performance of the proposed asynchronous scheme depends on the communication delay of passing messages among the regions. Under mild communication delays, the proposed scheme can achieve comparable or even faster convergence than its synchronous counterpart, making it a good alternative to centralized or synchronous distributed optimization approaches.
Comment: SmartGridComm 201
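A toy way to reproduce such delay studies (purely illustrative; this is a simulation device, not the paper's testbed) is to route every inter-region message through a channel that withholds it for a bounded number of rounds and then sweep the delay bound:

```python
import random

class DelayedChannel:
    """Delivers each message after a random delay of up to tau_max rounds."""

    def __init__(self, tau_max):
        self.tau_max = tau_max
        self.pending = []  # list of (deliver_at_round, message) pairs

    def send(self, msg, now):
        self.pending.append((now + random.randint(0, self.tau_max), msg))

    def recv(self, now):
        ready = [m for t, m in self.pending if t <= now]
        self.pending = [(t, m) for t, m in self.pending if t > now]
        return ready
```

Plotting iterations-to-tolerance against tau_max then exposes the delay dependence the paper reports.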
An Asynchronous, Decentralized Solution Framework for the Large Scale Unit Commitment Problem
With increased reliance on cyber infrastructure, large scale power networks face new computational scalability challenges. In this paper we focus on developing an asynchronous decentralized solution framework for the Unit Commitment (UC) problem for large scale power networks. We exploit the inherent asynchrony in a region-based decomposition, arising from the imbalance among regional subproblems, to boost computational efficiency. A two-phase algorithm is proposed that relies on convex relaxation and privacy-preserving valid inequalities to deliver algorithmic improvements. Our algorithm
employs a novel interleaved binary mechanism that locally switches from the
convex subproblem to its binary counterpart based on consistent local
convergent behavior. We develop a high performance computing (HPC) oriented
software framework that uses Message Passing Interface (MPI) to drive our
benchmark studies. Our simulations, performed on the IEEE 3012-bus case, are benchmarked against a centralized method and a state-of-the-art synchronous decentralized method. The results demonstrate that the asynchronous method significantly improves computational efficiency and provides solution quality rivaling that of the benchmark methods.
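The interleaved binary mechanism is described only at a high level; one plausible reading (our speculation, with illustrative names) is a per-region test that triggers the switch from the convex relaxation to the binary subproblem once local residuals stabilize:

```python
def switch_to_binary(residual_history, window=5, tol=1e-3):
    """Hypothetical switching test: True once the local primal residual
    has varied by less than `tol` over the last `window` iterations,
    i.e. the convex phase shows consistent local convergent behavior."""
    if len(residual_history) < window:
        return False
    recent = residual_history[-window:]
    return max(recent) - min(recent) < tol
```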
Asynchronous Incremental Stochastic Dual Descent Algorithm for Network Resource Allocation
Stochastic network optimization problems entail finding resource allocation
policies that are optimum on an average but must be designed in an online
fashion. Such problems are ubiquitous in communication networks, where
resources such as energy and bandwidth are divided among nodes to satisfy
certain long-term objectives. This paper proposes an asynchronous incremental dual descent resource allocation algorithm that utilizes delayed stochastic gradients for carrying out its updates. The proposed algorithm is well-suited
to heterogeneous networks as it allows the computationally-challenged or
energy-starved nodes to, at times, postpone the updates. The asymptotic
analysis of the proposed algorithm is carried out, establishing dual
convergence under both constant and diminishing step sizes. It is also shown that with a constant step size, the proposed resource allocation policy is asymptotically near-optimal. An application involving multi-cell coordinated beamforming is detailed, demonstrating the usefulness of the proposed algorithm.
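Schematically (our notation, not the paper's), the delayed incremental dual update takes the form

$$\lambda^{t+1} = \left[\lambda^{t} + \epsilon\, \hat g^{t}\right]_{+}, \qquad \hat g^{t} = g\!\left(\lambda^{t-\tau_t};\, \xi_t\right), \qquad 0 \le \tau_t \le \tau_{\max},$$

where $\lambda$ collects the dual variables, $\epsilon$ is the step size, $\xi_t$ is the network state observed online, and $\tau_t$ is the staleness incurred when a node postpones its update; bounding $\tau_t$ is the kind of condition under which such asymptotic analyses are typically carried out.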
Asynchronous Decentralized Parallel Stochastic Gradient Descent
Most commonly used distributed machine learning systems are either
synchronous or centralized asynchronous. Synchronous algorithms like
AllReduce-SGD perform poorly in a heterogeneous environment, while asynchronous
algorithms using a parameter server suffer from (1) a communication bottleneck at the parameter servers when workers are many, and (2) significantly worse convergence when the traffic to the parameter server is congested. Can we design an algorithm
that is robust in a heterogeneous environment, while being communication
efficient and maintaining the best-possible convergence rate? In this paper, we
propose an asynchronous decentralized parallel stochastic gradient descent algorithm (AD-PSGD) satisfying all of the above expectations. Our theoretical analysis shows that AD-PSGD converges at the same optimal rate as SGD and achieves linear speedup with respect to the number of workers. Empirically, AD-PSGD outperforms the best of
decentralized parallel SGD (D-PSGD), asynchronous parallel SGD (A-PSGD), and
standard data parallel SGD (AllReduce-SGD), often by orders of magnitude in a
heterogeneous environment. When training ResNet-50 on ImageNet with up to 128
GPUs, AD-PSGD converges (w.r.t. epochs) similarly to AllReduce-SGD, but each epoch can be up to 4-8X faster than its synchronous counterpart in a network-sharing HPC environment. To the best of our knowledge, AD-PSGD is the first asynchronous algorithm that achieves an epoch-wise convergence rate similar to that of AllReduce-SGD at an over-100-GPU scale.
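For intuition, one AD-PSGD step at a worker interleaves a pairwise gossip average with a local stochastic gradient step, roughly as follows (a simplified schematic of the update, reduced to two-worker averaging):

```python
def ad_psgd_step(x_local, x_neighbor, stochastic_grad, lr):
    """Schematic AD-PSGD update for one worker: average the model with a
    randomly chosen neighbor, then apply a stochastic gradient computed
    at the (possibly stale) pre-averaging iterate."""
    g = stochastic_grad(x_local)          # gradient on a local mini-batch
    x_avg = 0.5 * (x_local + x_neighbor)  # pairwise gossip averaging
    return x_avg - lr * g
```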
Asynchronous Distributed ADMM for Large-Scale Optimization- Part I: Algorithm and Convergence Analysis
Aiming at solving large-scale learning problems, this paper studies
distributed optimization methods based on the alternating direction method of
multipliers (ADMM). By formulating the learning problem as a consensus problem,
the ADMM can be used to solve the consensus problem in a fully parallel fashion
over a computer network with a star topology. However, traditional synchronized
computation does not scale well with the problem size, as the speed of the
algorithm is limited by the slowest workers. This is particularly true in a
heterogeneous network where the computing nodes experience different
computation and communication delays. In this paper, we propose an asynchronous distributed ADMM (AD-ADMM) which can effectively improve the time efficiency of
distributed optimization. Our main interest lies in analyzing the convergence
conditions of the AD-ADMM, under the popular partially asynchronous model,
which is defined based on a maximum tolerable delay of the network.
Specifically, by considering general and possibly non-convex cost functions, we
show that the AD-ADMM is guaranteed to converge to the set of
Karush-Kuhn-Tucker (KKT) points as long as the algorithm parameters are chosen
appropriately according to the network delay. We further illustrate that the
asynchrony of the ADMM has to be handled with care, as slightly modifying the
implementation of the AD-ADMM can jeopardize the algorithm convergence, even
under a standard convex setting.
Comment: 37 pages
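On a star topology the asynchrony sits at the master: it refreshes the consensus variable as soon as any subset of workers reports, reusing the latest report of silent workers, while the partially asynchronous model bounds how stale those reports may become. A minimal sketch (message format and function names are our assumptions):

```python
import numpy as np

def ad_admm_master(worker_ids, dim, n_iters, recv_some, send_all):
    """Schematic AD-ADMM master loop: recv_some() blocks until at least
    one worker's (x_i, u_i) report arrives; send_all(z) broadcasts the
    refreshed consensus variable back to the workers."""
    reports = {i: (np.zeros(dim), np.zeros(dim)) for i in worker_ids}
    z = np.zeros(dim)
    for _ in range(n_iters):
        for i, (x_i, u_i) in recv_some().items():  # a subset of workers
            reports[i] = (x_i, u_i)                # overwrite old report
        z = np.mean([x + u for (x, u) in reports.values()], axis=0)
        send_all(z)
    return z
```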
Revisiting Large Scale Distributed Machine Learning
Nowadays, with the widespread adoption of smartphones and other portable gadgets equipped with a variety of sensors, data is ubiquitously available, and the focus of machine learning has shifted from inferring from small training samples to dealing with large scale, high-dimensional data. In domains such as personal healthcare applications, which motivate this survey, distributed machine learning is a promising line of research, both for scaling up learning algorithms and, above all, for dealing with data that is inherently produced at different locations. This report offers a thorough overview of state-of-the-art algorithms for distributed machine learning, for both supervised and unsupervised learning, ranging from simple linear logistic regression to graphical models and clustering. We propose future directions for most categories, specific to potential personal healthcare applications. With this in mind, the report focuses on how security and low communication overhead can be assured in the specific case of a strictly client-server architectural model. As particular directions, we provide an exhaustive presentation of an empirical clustering algorithm, k-windows, and propose an asynchronous distributed machine learning algorithm that would scale well and would also be computationally cheap and easy to implement.
Distributed Optimization for Smart Cyber-Physical Networks
The presence of embedded electronics and communication capabilities, as well as sensing and control, in smart devices has given rise to the novel concept of cyber-physical networks, in which agents aim at cooperatively solving complex tasks by local computation and communication. Numerous estimation, learning, decision and control tasks in smart networks involve the solution of large-scale, structured optimization problems in which network agents have only a partial knowledge of the whole problem. Distributed optimization aims at designing local computation and communication rules for the network processors, allowing them to cooperatively solve the global optimization problem without relying on any central unit. The purpose of this survey is to provide an introduction to distributed optimization methodologies. The principal approaches, namely (primal) consensus-based, duality-based and constraint exchange methods, are formalized. An analysis of the basic schemes is supplied, and state-of-the-art extensions are reviewed.
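As a concrete instance of the primal consensus-based family, the basic decentralized (sub)gradient scheme mixes neighbors' iterates through a doubly stochastic weight matrix W and takes a local gradient step (a textbook sketch, not tied to any one surveyed method):

```python
import numpy as np

def dgd_step(X, W, local_grads, alpha):
    """One round of decentralized gradient descent: row i of X is agent
    i's iterate, W is the doubly stochastic mixing matrix of the network,
    and local_grads[i] is agent i's gradient at its own iterate."""
    return W @ X - alpha * local_grads
```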