6,590 research outputs found

    Asynchronous Decentralized Parallel Stochastic Gradient Descent

    Full text link
    Most commonly used distributed machine learning systems are either synchronous or centralized asynchronous. Synchronous algorithms like AllReduce-SGD perform poorly in a heterogeneous environment, while asynchronous algorithms using a parameter server suffer from 1) a communication bottleneck at the parameter servers when workers are many, and 2) significantly worse convergence when the traffic to the parameter server is congested. Can we design an algorithm that is robust in a heterogeneous environment, while being communication efficient and maintaining the best-possible convergence rate? In this paper, we propose an asynchronous decentralized parallel stochastic gradient descent algorithm (AD-PSGD) satisfying all of the above requirements. Our theoretical analysis shows that AD-PSGD converges at the same optimal O(1/\sqrt{K}) rate as SGD and has linear speedup with respect to the number of workers. Empirically, AD-PSGD outperforms the best of decentralized parallel SGD (D-PSGD), asynchronous parallel SGD (A-PSGD), and standard data-parallel SGD (AllReduce-SGD), often by orders of magnitude in a heterogeneous environment. When training ResNet-50 on ImageNet with up to 128 GPUs, AD-PSGD converges (w.r.t. epochs) similarly to AllReduce-SGD, but each epoch can be 4-8X faster than its synchronous counterparts in a network-sharing HPC environment. To the best of our knowledge, AD-PSGD is the first asynchronous algorithm that achieves a similar epoch-wise convergence rate as AllReduce-SGD at an over 100-GPU scale.
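
    As a reading aid, here is a minimal serial simulation of the AD-PSGD update rule sketched in the abstract: an idle worker draws a stochastic gradient at its current (possibly stale) model, atomically averages its model with a randomly chosen neighbor, and then applies the gradient. The ring topology, toy least-squares data, and step size below are illustrative assumptions, not the paper's experimental setup.

        import numpy as np

        rng = np.random.default_rng(0)
        n_workers, dim, steps, lr = 8, 10, 2000, 0.02

        # Synthetic data shards: worker i holds (A_i, b_i) for a least-squares loss.
        x_true = rng.normal(size=dim)
        shards = []
        for _ in range(n_workers):
            A = rng.normal(size=(50, dim))
            b = A @ x_true + 0.1 * rng.normal(size=50)
            shards.append((A, b))

        models = [rng.normal(size=dim) for _ in range(n_workers)]  # one model copy per worker

        for _ in range(steps):
            i = rng.integers(n_workers)                 # an idle worker wakes up...
            j = (i + rng.choice([-1, 1])) % n_workers   # ...and picks a random ring neighbor
            A, b = shards[i]
            k = rng.integers(len(b))
            grad = A[k] * (A[k] @ models[i] - b[k])     # one-sample gradient at the local model
            avg = 0.5 * (models[i] + models[j])         # atomic pairwise model averaging
            models[i], models[j] = avg - lr * grad, avg # gradient applied after averaging

        print("mean distance to x_true:",
              np.mean([np.linalg.norm(m - x_true) for m in models]))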

    An Asynchronous, Decentralized Solution Framework for the Large Scale Unit Commitment Problem

    Full text link
    With increased reliance on cyber infrastructure, large-scale power networks face new challenges owing to computational scalability. In this paper we focus on developing an asynchronous decentralized solution framework for the Unit Commitment (UC) problem for large-scale power networks. We exploit the inherent asynchrony in a region-based decomposition, arising from the imbalance among regional subproblems, to boost computational efficiency. A two-phase algorithm is proposed that relies on convex relaxation and privacy-preserving valid inequalities to deliver algorithmic improvements. Our algorithm employs a novel interleaved binary mechanism that locally switches from the convex subproblem to its binary counterpart based on consistent local convergence behavior. We develop a high-performance computing (HPC) oriented software framework that uses the Message Passing Interface (MPI) to drive our benchmark studies. Our simulations, performed on the IEEE 3012-bus case, are benchmarked against a centralized and a state-of-the-art synchronous decentralized method. The results demonstrate that the asynchronous method improves computational efficiency significantly and provides solution quality that rivals the benchmark methods.
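
    The interleaved binary mechanism is the distinctive ingredient here. The toy sketch below illustrates only that switching logic: each region iterates asynchronously on a relaxed, continuous commitment variable and, once its local iterates stop changing, rounds to its binary counterpart. The data, step rule, and convergence test are illustrative stand-ins, not the paper's UC formulation.

        import numpy as np

        rng = np.random.default_rng(1)
        n_regions, tol = 4, 1e-3
        target = rng.uniform(0.2, 0.8, size=n_regions)  # stand-in for regional subproblem optima

        u = np.full(n_regions, 0.5)                     # relaxed commitment variables in [0, 1]
        binary_mode = np.zeros(n_regions, dtype=bool)

        for _ in range(200):
            i = rng.integers(n_regions)                 # regions update asynchronously
            if not binary_mode[i]:
                new = u[i] + 0.5 * (target[i] - u[i])   # one step on the convex relaxation
                if abs(new - u[i]) < tol:               # consistent local convergence ->
                    binary_mode[i] = True               # switch to the binary counterpart,
                    new = round(new)                    # here via trivial rounding
                u[i] = new

        print("commitments:", u, "all regions binary:", bool(binary_mode.all()))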

    Asynchronous Decentralized Stochastic Optimization in Heterogeneous Networks

    Full text link
    We consider expected risk minimization in multi-agent systems comprised of distinct subsets of agents operating without a common time-scale. Each individual in the network is charged with minimizing the global objective, an average of the statistical average loss functions of all agents in the network. Since agents are not assumed to observe data from identical distributions, the premise that all agents seek a common action is violated, and with it the premise upon which consensus constraints are formulated. We therefore consider nonlinear network proximity constraints that incentivize nearby nodes to make decisions that are close to one another but need not coincide. Moreover, agents are not assumed to receive their sequentially arriving observations on a common time index, and thus seek to learn in an asynchronous manner. An asynchronous stochastic variant of the Arrow-Hurwicz saddle point method is proposed to solve this problem, operating by alternating primal stochastic descent steps with Lagrange multiplier updates that penalize discrepancies between agents. This tool leads to an implementation that allows each agent to operate asynchronously with local information only and message passing with neighbors. Our main result establishes that the proposed method yields convergence in expectation, in terms of both primal sub-optimality and constraint violation, to radii of sizes \mathcal{O}(\sqrt{T}) and \mathcal{O}(T^{3/4}), respectively. Empirical evaluation on an asynchronously operating wireless network that manages user channel interference through an adaptive communications pricing mechanism demonstrates that our theoretical results translate well to practice.
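
    A minimal sketch of the alternating primal-dual (Arrow-Hurwicz-style) updates described above, on a two-agent toy problem: each waking agent takes a stochastic descent step on its local loss penalized by a quadratic proximity term, then performs a multiplier ascent step on the constraint violation. The losses, constraint form, and step sizes are illustrative assumptions.

        import numpy as np

        rng = np.random.default_rng(2)
        dim, steps, eta, gamma = 5, 3000, 0.02, 0.1
        theta = [rng.normal(size=dim), rng.normal(size=dim)]  # each agent's data mean
        x = [np.zeros(dim), np.zeros(dim)]                    # primal variables
        lam = np.zeros(2)                                     # one multiplier per agent

        for _ in range(steps):
            i = rng.integers(2)                               # agents wake up asynchronously
            j = 1 - i
            sample = theta[i] + rng.normal(size=dim)          # local stochastic observation
            grad_f = x[i] - sample                            # stochastic gradient of E||x - sample||^2 / 2
            grad_c = 2 * (x[i] - x[j])                        # gradient of the proximity term
            x[i] -= eta * (grad_f + lam[i] * grad_c)          # primal stochastic descent
            viol = np.dot(x[i] - x[j], x[i] - x[j]) - gamma   # constraint ||x_i - x_j||^2 <= gamma
            lam[i] = max(0.0, lam[i] + eta * viol)            # Lagrange multiplier update

        print("disagreement between agents:", np.linalg.norm(x[0] - x[1]))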

    Decentralized Dynamic Optimization for Power Network Voltage Control

    Full text link
    Voltage control in power distribution networks has been greatly challenged by the increasing penetration of volatile and intermittent devices. These devices can also provide limited reactive power resources that can be used to regulate the network-wide voltage. A decentralized voltage control strategy can be designed by minimizing a quadratic voltage mismatch error objective using gradient-projection (GP) updates. Because the local voltage is coupled with the network power flow, it provides the instantaneous gradient information. This paper analyzes the performance of this decentralized GP-based voltage control design under two dynamic scenarios: i) the nodes perform the decentralized update in an asynchronous fashion, and ii) the network operating condition is time-varying. For asynchronous voltage control, we improve the existing convergence condition by recognizing that the voltage-based gradient is always up-to-date. By modeling the network dynamics using an autoregressive process and considering time-varying resource constraints, we provide an error bound on tracking the instantaneous optimal solution to the quadratic error objective. This result extends to more general constrained dynamic optimization problems with smooth strongly convex objective functions under stochastic processes that have bounded iterative changes. Extensive numerical tests have been performed to demonstrate and validate our analytical results for realistic power networks.
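
    A minimal sketch of the decentralized GP update under a linearized voltage model V = Xq + v0 (a hypothetical stand-in for the power flow coupling): each node descends along its own voltage mismatch and projects onto its reactive power limits. The sensitivity matrix, bounds, and step size below are illustrative.

        import numpy as np

        rng = np.random.default_rng(3)
        n, steps, eps = 6, 500, 0.1
        B = rng.normal(size=(n, n))
        X = B @ B.T / n + np.eye(n)             # symmetric positive definite sensitivity matrix
        v0 = 1.0 + 0.05 * rng.normal(size=n)    # uncontrolled voltage profile (p.u.)
        q_min, q_max = -0.3, 0.3                # per-node reactive power limits

        q = np.zeros(n)
        for _ in range(steps):
            j = rng.integers(n)                 # nodes update asynchronously
            v = X @ q + v0                      # node j only needs its own voltage v[j]
            # GP step on f(q) = ||v - 1||^2 / 2: the local error -(v[j] - 1) is a
            # descent direction because X is positive definite; clip = projection.
            q[j] = np.clip(q[j] - eps * (v[j] - 1.0), q_min, q_max)

        print("final voltage deviation:", np.linalg.norm(X @ q + v0 - 1.0))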

    Scalable Deep Learning on Distributed Infrastructures: Challenges, Techniques and Tools

    Full text link
    Deep Learning (DL) has had immense success in the recent past, leading to state-of-the-art results in various domains such as image recognition and natural language processing. One of the reasons for this success is the increasing size of DL models and the availability of vast amounts of training data. To keep improving the performance of DL, increasing the scalability of DL systems is necessary. In this survey, we perform a broad and thorough investigation of challenges, techniques and tools for scalable DL on distributed infrastructures. This covers infrastructures for DL, methods for parallel DL training, multi-tenant resource scheduling and the management of training and model data. Further, we analyze and compare 11 current open-source DL frameworks and tools and investigate which of the techniques are commonly implemented in practice. Finally, we highlight future research trends in DL systems that deserve further research. Comment: accepted at ACM Computing Surveys, to appear

    Asynchronous Distributed Optimization with Heterogeneous Regularizations and Normalizations

    Full text link
    As multi-agent networks grow in size and scale, they become increasingly difficult to synchronize, yet agents must work together even when generating and sharing different information at different times. Targeting such cases, this paper presents an asynchronous optimization framework in which the time between successive communications and computations is unknown and unspecified for each agent. Agents' updates are carried out in blocks, with each agent updating only a small subset of all decision variables. To provide robustness to asynchrony, each agent uses an independently chosen Tikhonov regularization. Convergence is measured with respect to a weighted block-maximum norm in which the convergence of agents' blocks can be measured in different p-norms and weighted differently to heterogeneously normalize problems. Asymptotic convergence is shown and convergence rates are derived explicitly in terms of a problem's parameters, with only mild restrictions imposed upon them. Simulation results are provided to verify the theoretical developments. Comment: 13 pages, 4 figures, 2 tables. Accepted to the 2018 IEEE CDC
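
    A minimal sketch of the block-update scheme described above, assuming a toy strongly convex quadratic: each agent updates only its own block, at times of its own choosing, using a possibly outdated copy of the other agents' blocks and its own independently chosen Tikhonov regularization weight.

        import numpy as np

        rng = np.random.default_rng(4)
        n_agents, block, steps, lr = 4, 3, 400, 0.1
        dim = n_agents * block
        Q = np.eye(dim) + 0.2 * np.ones((dim, dim))    # simple strongly convex coupling
        alpha = rng.uniform(0.01, 0.1, size=n_agents)  # independently chosen regularizations

        x = rng.normal(size=dim)
        stale = [x.copy() for _ in range(n_agents)]    # each agent's outdated global view

        for _ in range(steps):
            i = rng.integers(n_agents)                 # update times unknown and unspecified
            if rng.random() < 0.5:
                stale[i] = x.copy()                    # communication happens only sometimes
            s = slice(i * block, (i + 1) * block)
            view = stale[i].copy()
            view[s] = x[s]                             # an agent always knows its own block
            g = (Q @ view)[s] + alpha[i] * x[s]        # gradient of the regularized local block
            x[s] = x[s] - lr * g

        print("distance to the regularized optimum at 0:", np.linalg.norm(x))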

    Asynchronous Decentralized 20 Questions for Adaptive Search

    Full text link
    This paper considers the problem of adaptively searching for an unknown target using multiple agents connected through a time-varying network topology. Agents are equipped with sensors capable of fast information processing, and we propose a decentralized collaborative algorithm for controlling their search given noisy observations. Specifically, we propose decentralized extensions of the adaptive query-based search strategy that combines elements of the 20 questions approach and social learning. Under standard assumptions on the time-varying network dynamics, we prove convergence to correct consensus on the value of the unknown parameter as the number of iterations goes to infinity. The convergence analysis takes a novel approach using martingale-based techniques combined with spectral graph theory. Our results establish that stability and consistency can be maintained even with one-way updating and randomized pairwise averaging, thus providing a scalable, low-complexity method with performance guarantees. We illustrate the effectiveness of our algorithm for random network topologies. Comment: 19 pages, Submitted. arXiv admin note: substantial text overlap with arXiv:1312.784
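
    The one-way updating with randomized pairwise averaging can be sketched in a few lines; the snippet below illustrates only that consensus component (not the 20-questions query step), on an illustrative randomly sampled topology.

        import numpy as np

        rng = np.random.default_rng(5)
        n_agents, steps = 10, 2000
        belief = rng.uniform(size=n_agents)   # each agent's estimate of the target parameter

        for _ in range(steps):
            i = rng.integers(n_agents)                       # agent i wakes up
            j = (i + rng.integers(1, n_agents)) % n_agents   # random peer (time-varying link)
            belief[i] = 0.5 * (belief[i] + belief[j])        # one-way update: only i changes

        print("belief spread:", belief.max() - belief.min())  # shrinks toward consensus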

    Massively-concurrent Agent-based Evolutionary Computing

    Full text link
    The fusion of the multi-agent paradigm with evolutionary computation has yielded promising results in many optimization problems. Evolutionary multi-agent systems (EMAS) are more similar to biological evolution than classical evolutionary algorithms. However, technological limitations prevented the use of fully asynchronous agents in previous EMAS implementations. In this paper we present a new algorithm for agent-based evolutionary computation in which the individuals are represented as fully autonomous and asynchronous agents. An efficient implementation of this algorithm was made possible by modern technologies based on functional languages (namely Erlang and Scala), which natively support lightweight processes and asynchronous communication. Our experiments show that such an asynchronous approach is both faster and more efficient at solving common optimization problems. Comment: Journal of Computational Science, Available online 29 July 201
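
    A loose sketch of the fully asynchronous agent idea in Python's asyncio, standing in for the Erlang/Scala lightweight processes the paper uses: each individual is an independent task with a genome and an energy budget, and random meetings transfer energy toward fitter agents. The fitness function, energy rules, and mutation trigger are illustrative inventions, not the paper's EMAS operators.

        import asyncio, random

        def fitness(genome):
            return -abs(genome)               # toy objective: the best genome is 0

        async def agent(pool, idx):
            # Each agent is an independent asyncio task with no shared clock.
            while pool[idx]["energy"] > 0:
                await asyncio.sleep(random.random() * 0.001)
                other = random.randrange(len(pool))
                if other == idx or pool[other]["energy"] <= 0:
                    continue
                a, b = pool[idx], pool[other]
                winner, loser = (a, b) if fitness(a["genome"]) >= fitness(b["genome"]) else (b, a)
                winner["energy"] += 1         # "meeting": energy flows to the fitter agent
                loser["energy"] -= 1
                if winner["energy"] > 15:     # crude stand-in for reproduction: mutate
                    winner["genome"] += random.gauss(0, 0.1)

        async def main():
            random.seed(6)
            pool = [{"genome": random.uniform(-5, 5), "energy": 10} for _ in range(20)]
            tasks = [asyncio.create_task(agent(pool, i)) for i in range(len(pool))]
            await asyncio.sleep(0.5)          # let the agents run concurrently
            for t in tasks:
                t.cancel()
            await asyncio.gather(*tasks, return_exceptions=True)
            best = min((p for p in pool if p["energy"] > 0), key=lambda p: abs(p["genome"]))
            print("best surviving genome:", best["genome"])

        asyncio.run(main())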

    Decentralized Schemes with Overlap for Solving Graph-Structured Optimization Problems

    Full text link
    We present a new algorithmic paradigm for the decentralized solution of graph-structured optimization problems that arise in the estimation and control of network systems. A key and novel design concept of the proposed approach is its use of overlapping subdomains to promote and accelerate convergence. We show that the algorithm converges if the size of the overlap is sufficiently large and that the convergence rate improves exponentially with the size of the overlap. The proposed approach provides a bridge between fully decentralized and centralized architectures and is flexible in that it enables the implementation of asynchronous schemes, the handling of constraints, and the balancing of computing, communication, and data privacy needs. The proposed scheme is tested on an estimation problem for a 9,241-node power network, where we show that it outperforms the alternating direction method of multipliers.
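
    A minimal sketch of the overlapping-subdomain idea on a chain-structured quadratic: each subdomain is expanded by an overlap region and solved exactly against the current values of the remaining variables. Increasing the overlap below lowers the residual reached in a fixed number of sweeps, mirroring the convergence behavior claimed above; the problem data and partitioning are illustrative.

        import numpy as np

        n, n_blocks, overlap, sweeps = 60, 4, 5, 30
        A = 2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)  # chain-graph coupling
        b = np.ones(n)
        x = np.zeros(n)
        size = n // n_blocks
        all_idx = np.arange(n)

        for _ in range(sweeps):
            for k in range(n_blocks):
                lo = max(0, k * size - overlap)          # subdomain expanded by the overlap
                hi = min(n, (k + 1) * size + overlap)
                inside = all_idx[lo:hi]
                outside = np.concatenate([all_idx[:lo], all_idx[hi:]])
                # Solve the local subproblem exactly, holding outside variables fixed.
                rhs = b[inside] - A[np.ix_(inside, outside)] @ x[outside]
                x[inside] = np.linalg.solve(A[np.ix_(inside, inside)], rhs)

        print("residual after", sweeps, "sweeps:", np.linalg.norm(A @ x - b))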

    Edge Intelligence: Paving the Last Mile of Artificial Intelligence with Edge Computing

    Full text link
    With the breakthroughs in deep learning, recent years have witnessed a boom in artificial intelligence (AI) applications and services, spanning from personal assistants to recommendation systems to video/audio surveillance. More recently, with the proliferation of mobile computing and the Internet of Things (IoT), billions of mobile and IoT devices are connected to the Internet, generating massive volumes of data at the network edge. Driven by this trend, there is an urgent need to push the AI frontiers to the network edge so as to fully unleash the potential of edge big data. To meet this demand, edge computing, an emerging paradigm that pushes computing tasks and services from the network core to the network edge, has been widely recognized as a promising solution. The resulting new interdisciplinary field, edge AI or edge intelligence, is beginning to receive a tremendous amount of interest. However, research on edge intelligence is still in its infancy, and a dedicated venue for exchanging its recent advances is highly desired by both the computer systems and artificial intelligence communities. To this end, we conduct a comprehensive survey of the recent research efforts on edge intelligence. Specifically, we first review the background and motivation for running artificial intelligence at the network edge. We then provide an overview of the overarching architectures, frameworks and emerging key technologies for deep learning training and inference at the network edge. Finally, we discuss future research opportunities on edge intelligence. We believe that this survey will elicit escalating attention, stimulate fruitful discussion and inspire further research ideas on edge intelligence. Comment: Zhi Zhou, Xu Chen, En Li, Liekang Zeng, Ke Luo, and Junshan Zhang, "Edge Intelligence: Paving the Last Mile of Artificial Intelligence with Edge Computing," Proceedings of the IEEE
    • …