14 research outputs found

    Networked Federated Learning

    Full text link
    We develop the theory and algorithmic toolbox for networked federated learning in decentralized collections of local datasets with an intrinsic network structure. This network structure arises from domain-specific notions of similarity between local datasets. Different notions of similarity are induced by spatio-temporal proximity, statistical dependencies or functional relations. Our main conceptual contribution is to formulate networked federated learning as generalized total variation minimization. This formulation unifies and considerably extends existing federated multi-task learning methods. It is highly flexible and can be combined with a broad range of parametric models, including Lasso and deep neural networks. Our main algorithmic contribution is a novel networked federated learning algorithm that is well suited for distributed computing environments such as edge computing over wireless networks. This algorithm is robust against inexact computations arising from limited computational resources, such as restricted processing time or bandwidth. For local models resulting in convex problems, we derive precise conditions on the local models and their network structure under which our algorithm learns nearly optimal local models. Our analysis reveals an interesting interplay between the convex geometry of local models and the (cluster-) geometry of their network structure.
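
    The abstract does not spell out the objective; a minimal sketch of a generalized total variation (GTV) style formulation, assuming linear local models with squared loss, a hypothetical weighted edge list encoding dataset similarity, and naive centralized subgradient descent in place of the paper's distributed algorithm, might look like this:

```python
import numpy as np

# Hypothetical setup: n nodes, each with a small local dataset (X_i, y_i),
# connected by weighted edges that encode similarity between local datasets.
rng = np.random.default_rng(0)
n_nodes, dim = 4, 3
local_data = [(rng.normal(size=(20, dim)), rng.normal(size=20)) for _ in range(n_nodes)]
edges = [(0, 1, 1.0), (1, 2, 0.5), (2, 3, 1.0)]  # (i, j, weight)
lam = 0.1  # strength of the GTV coupling term

def gtv_objective(W):
    """Sum of local squared losses plus a total-variation penalty on edges."""
    loss = sum(np.sum((X @ W[i] - y) ** 2) for i, (X, y) in enumerate(local_data))
    tv = sum(w_ij * np.linalg.norm(W[i] - W[j]) for i, j, w_ij in edges)
    return loss + lam * tv

# Naive (sub)gradient descent on the stacked local models; the paper's
# algorithm would instead solve this with a distributed primal-dual method.
W = np.zeros((n_nodes, dim))
step = 1e-3
for _ in range(500):
    grad = np.zeros_like(W)
    for i, (X, y) in enumerate(local_data):
        grad[i] += 2 * X.T @ (X @ W[i] - y)
    for i, j, w_ij in edges:
        diff = W[i] - W[j]
        g = w_ij * diff / (np.linalg.norm(diff) + 1e-12)
        grad[i] += lam * g
        grad[j] -= lam * g
    W -= step * grad

print(gtv_objective(W))  # objective value after training
```

    The coupling term pulls the models of similar (strongly connected) nodes toward each other, which is what lets nodes with small local datasets borrow statistical strength from their neighbours.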

    Autonomy and Intelligence in the Computing Continuum: Challenges, Enablers, and Future Directions for Orchestration

    Full text link
    Future AI applications require performance, reliability and privacy that existing, cloud-dependent system architectures cannot provide. In this article, we study orchestration in the device-edge-cloud continuum and focus on AI for edge, that is, the AI methods used in resource orchestration. We claim that to support the constantly growing requirements of intelligent applications in the device-edge-cloud computing continuum, resource orchestration needs to embrace edge AI and emphasize local autonomy and intelligence. To justify the claim, we provide a general definition of continuum orchestration and examine how current and emerging orchestration paradigms suit the computing continuum. We describe several major emerging research themes that may affect future orchestration, and provide an early vision of an orchestration paradigm that embraces those themes. Finally, we survey current key edge AI methods and discuss how they may contribute to fulfilling the vision of future continuum orchestration.

    Federated Learning of Artificial Neural Networks

    Get PDF
    Training today's most widely applicable machine learning (ML) models, and artificial neural networks in particular, requires extremely large amounts of data and significant computational capacity. Federated Learning (FL) research focuses on the collaborative training of ML models over today's heterogeneous, geographically highly distributed information infrastructure. FL thus aims to spread the computational burden of learning across the participants (nodes), processing data where it is generated, while learning itself proceeds by periodically collecting and aggregating the model updates computed at the nodes and redistributing the updated model. In our view, FL research proceeds in three main directions: (1) The first addresses applying the generally accepted federated learning method, Federated Averaging (FedAvg), in realistic settings, that is, how the necessary communication and computational capacity can be provided. (2) The second focuses on problems arising when applying the FedAvg algorithm, such as the declining overall accuracy of the model and the potentially inadequate performance of the shared model for individual end users. (3) The third, heavily researched topic examines how to protect the participants' confidential data as strongly as possible. In this dissertation, I present our work on improving the federated training of artificial neural networks along these directions, which we consider the most important. The presented methods mitigate the respective problems based on the following ideas: (1) a peer-to-peer transformation of the FedAvg algorithm; (2) the application of optimization methods based on past states; and (3) the application of nature-inspired optimization methods that do not require gradients.
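
    For context, a minimal sketch of the FedAvg round structure the abstract refers to, assuming toy linear-regression nodes with NumPy weight vectors (the datasets and hyperparameters are hypothetical), could be:

```python
import numpy as np

# Hypothetical toy setup: each node holds a local linear-regression dataset.
rng = np.random.default_rng(1)
dim, n_nodes = 5, 3
nodes = [(rng.normal(size=(50, dim)), rng.normal(size=50)) for _ in range(n_nodes)]

def local_update(w, X, y, lr=0.01, epochs=5):
    """Run a few epochs of gradient descent on one node's local data."""
    w = w.copy()
    for _ in range(epochs):
        w -= lr * 2 * X.T @ (X @ w - y) / len(y)
    return w

# One FedAvg round: broadcast the global model, train locally on each node,
# then average the returned models, weighted by local dataset size.
w_global = np.zeros(dim)
for round_ in range(20):
    local_models = [local_update(w_global, X, y) for X, y in nodes]
    sizes = np.array([len(y) for _, y in nodes])
    w_global = np.average(local_models, axis=0, weights=sizes)
```

    The peer-to-peer variant the dissertation proposes would replace the central averaging step with exchanges between neighbouring nodes, removing the single aggregation point.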

    Improving the performance of dataflow systems for deep neural network training

    No full text
    Deep neural networks (DNNs) have led to significant advancements in machine learning. With deep structure and flexible model parameterisation, they exhibit state-of-the-art accuracies for many complex tasks, e.g. image recognition. To achieve this, models are trained iteratively over large datasets. This process involves expensive matrix operations, making it time-consuming to obtain converged models. To accelerate training, dataflow systems parallelise computation. A scalable approach is to use the parameter server framework: it has workers that train model replicas in parallel and parameter servers that synchronise the replicas to ensure convergence. With distributed DNN systems, there are three challenges that determine the training completion time. In this thesis, we propose practical and effective techniques to address each of these challenges. Since frequent model synchronisation results in high network utilisation, the parameter server approach can suffer from network bottlenecks, thus requiring decisions on resource allocation. Our idea is to use all available network bandwidth and synchronise subject to that bandwidth. We present Ako, a DNN system that uses partial gradient exchange for synchronising replicas in a peer-to-peer fashion. We show that our technique exhibits a 25% lower convergence time than hand-tuned parameter-server deployments. For long training runs, the compute efficiency of worker nodes is important. We argue that processing hardware should be fully utilised for the best speed-up. The key observation is that it is possible to overlap the execution of several matrix operations with other workloads. We describe Crossbow, a GPU-based system that maximises hardware utilisation. By using a multi-streaming scheduler, multiple models are trained in parallel on a GPU, achieving a 2.3x speed-up compared to a state-of-the-art system. The choice of model configuration for replicas also directly determines convergence quality. Dataflow systems are used for exploring promising configurations but provide little support for efficient exploratory workflows. We present Meta-dataflow (MDF), a dataflow model that expresses complex workflows. By treating all configurations as a unified workflow, MDFs efficiently reduce the time spent on configuration exploration.
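
    The abstract describes partial gradient exchange only at a high level; a minimal sketch of the core idea, assuming each worker sends a rotating partition of its gradient to each peer per round (the partitioning scheme here is a hypothetical simplification, and the accumulation of unsent partitions that a real system would need is omitted), might look like:

```python
import numpy as np

# Hypothetical toy setup: p workers hold model replicas of dimension dim.
rng = np.random.default_rng(2)
p, dim = 4, 12
replicas = [rng.normal(size=dim) for _ in range(p)]
partitions = np.array_split(np.arange(dim), p)  # disjoint index partitions
lr = 0.1

def fake_gradient(w):
    """Stand-in for a real backprop gradient on a local mini-batch."""
    return -w + rng.normal(scale=0.01, size=w.shape)

# One round: each worker applies its full gradient locally, but exchanges
# only one partition of it with each peer, bounding per-round network traffic.
for round_ in range(10):
    grads = [fake_gradient(w) for w in replicas]
    for i in range(p):
        replicas[i] += lr * grads[i]                   # full local step
        for j in range(p):
            if i == j:
                continue
            part = partitions[(round_ + i + j) % p]    # rotate partitions
            replicas[j][part] += lr * grads[i][part]   # partial exchange
```

    Because each worker sends only 1/p of its gradient to each peer per round, traffic stays within the available bandwidth while every coordinate is still propagated over successive rounds.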

    27th Annual European Symposium on Algorithms: ESA 2019, September 9-11, 2019, Munich/Garching, Germany

    Get PDF