3 research outputs found
Towards Federated Learning at Scale: System Design
Federated Learning is a distributed machine learning approach which enables
model training on a large corpus of decentralized data. We have built a
scalable production system for Federated Learning in the domain of mobile
devices, based on TensorFlow. In this paper, we describe the resulting
high-level design, sketch some of the challenges and their solutions, and touch
upon the open problems and future directions.
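
For context on the kind of training loop such a system orchestrates, below is a minimal, simulated Federated Averaging (FedAvg) round in plain Python. This is an illustrative sketch only, not the paper's TensorFlow-based production design; all names (local_sgd, fedavg_round, the toy gradient) are assumptions introduced here.

import numpy as np

def local_sgd(global_model, data, lr=0.1, epochs=1):
    """Simulate one client's local training starting from the global model."""
    model = global_model.copy()
    for _ in range(epochs):
        # Toy "gradient": pull the model toward the client's data mean.
        grad = model - data.mean(axis=0)
        model -= lr * grad
    return model

def fedavg_round(global_model, client_datasets):
    """One synchronous round: selected clients train locally, server averages."""
    updates = [local_sgd(global_model, data) for data in client_datasets]
    # Weight each client's model by its number of examples.
    weights = np.array([len(d) for d in client_datasets], dtype=float)
    weights /= weights.sum()
    return sum(w * m for w, m in zip(weights, updates))

# Example: 5 simulated clients, 10-dimensional model, 3 rounds.
rng = np.random.default_rng(0)
clients = [rng.normal(size=(20, 10)) for _ in range(5)]
model = np.zeros(10)
for _ in range(3):
    model = fedavg_round(model, clients)

The production system described in the paper wraps this basic round structure in device selection, secure communication, and failure handling at mobile-device scale.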
Papaya: Practical, Private, and Scalable Federated Learning
Cross-device Federated Learning (FL) is a distributed learning paradigm with
several challenges that differentiate it from traditional distributed learning;
variability in the system characteristics of each device and millions of
clients coordinating with a central server are primary among them. Most FL systems
described in the literature are synchronous - they perform a synchronized
aggregation of model updates from individual clients. Scaling synchronous FL is
challenging since increasing the number of clients training in parallel leads
to diminishing returns in training speed, analogous to large-batch training.
Moreover, stragglers hinder synchronous FL training. In this work, we outline a
production asynchronous FL system design. Our work tackles the aforementioned
issues, sketches some of the system design challenges and their solutions,
and touches upon principles that emerged from building a production FL system
for millions of clients. Empirically, we demonstrate that asynchronous FL
converges faster than synchronous FL when training across nearly one hundred
million devices. In particular, in high concurrency settings, asynchronous FL
is 5x faster and has nearly 8x less communication overhead than synchronous FL.
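
As a rough illustration of the asynchronous aggregation idea described above (not Papaya's actual protocol, whose details are in the paper), the server can apply each client delta as soon as it arrives, down-weighting stale updates. The polynomial staleness discount and the learning rate below are assumptions made for this sketch.

import numpy as np

def staleness_weight(staleness, a=0.5):
    """Polynomial staleness discount: older updates contribute less."""
    return (1.0 + staleness) ** (-a)

def apply_async_update(server_model, client_delta, client_round, server_round, server_lr=1.0):
    """Apply one client's delta immediately, scaled by how stale it is."""
    staleness = server_round - client_round
    return server_model + server_lr * staleness_weight(staleness) * client_delta

# Example: a delta computed against round 7 arrives while the server is at round 10.
model = np.zeros(4)
delta = np.ones(4)
model = apply_async_update(model, delta, client_round=7, server_round=10)

Because no round waits for every participant, stragglers delay nothing and concurrency can grow without the diminishing returns of synchronized aggregation.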
Federated Learning with Buffered Asynchronous Aggregation
Scalability and privacy are two critical concerns for cross-device federated
learning (FL) systems. In this work, we identify that synchronous FL, i.e., the
synchronized aggregation of client updates, cannot scale efficiently
beyond a few hundred clients training in parallel. It leads to diminishing
returns in model performance and training speed, analogous to large-batch
training. On the other hand, asynchronous aggregation of client updates in FL
(i.e., asynchronous FL) alleviates the scalability issue. However, aggregating
individual client updates is incompatible with Secure Aggregation, which could
result in an undesirable level of privacy for the system. To address these
concerns, we propose a novel buffered asynchronous aggregation method, FedBuff,
that is agnostic to the choice of optimizer, and combines the best properties
of synchronous and asynchronous FL. We empirically demonstrate that FedBuff is
3.3x more efficient than synchronous FL and up to 2.5x more efficient than
asynchronous FL, while being compatible with privacy-preserving technologies
such as Secure Aggregation and differential privacy. We provide theoretical
convergence guarantees in a smooth non-convex setting. Finally, we show that
under differentially private training, FedBuff can outperform FedAvgM at low
privacy settings and achieve the same utility for higher privacy settings.
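
A minimal sketch of the buffered asynchronous aggregation idea behind FedBuff, under assumed details: client deltas accumulate in a server-side buffer, and the model is updated only once the buffer holds K contributions, so individual updates are never applied on their own. The class name, buffer size, and plain averaging step are illustrative choices, not the paper's exact algorithm or hyperparameters.

import numpy as np

class BufferedAggregator:
    """Accumulates asynchronously arriving client deltas and applies them K at a time."""

    def __init__(self, model_dim, buffer_size=10, server_lr=1.0):
        self.model = np.zeros(model_dim)
        self._buffer_sum = np.zeros(model_dim)
        self._count = 0
        self.buffer_size = buffer_size
        self.server_lr = server_lr

    def receive(self, client_delta):
        # Only a running sum over the buffer is kept, so the server step
        # operates on an aggregate rather than on individual client updates.
        self._buffer_sum += client_delta
        self._count += 1
        if self._count >= self.buffer_size:
            # Server step: apply the averaged buffered update, then reset the buffer.
            self.model += self.server_lr * self._buffer_sum / self._count
            self._buffer_sum[:] = 0.0
            self._count = 0

# Example: 25 simulated client deltas trickle in; the model moves every 10th delta.
agg = BufferedAggregator(model_dim=4, buffer_size=10)
rng = np.random.default_rng(1)
for _ in range(25):
    agg.receive(rng.normal(size=4))

Updating only from the buffered aggregate is what lets this style of training remain compatible with aggregate-level protections such as Secure Aggregation while keeping the scalability benefits of asynchrony.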