333 research outputs found
Continual Local Training for Better Initialization of Federated Models
Federated learning (FL) refers to the learning paradigm that trains machine
learning models directly on decentralized systems of smart edge devices
without transmitting the raw data, thereby avoiding heavy communication costs
and privacy concerns. Given the heterogeneous data distributions typical of
such settings, the popular FL algorithm \emph{Federated Averaging} (FedAvg)
suffers from weight divergence and thus cannot achieve competitive
performance for the global model (denoted as the \emph{initial performance}
in FL) compared to centralized methods. In this paper, we propose a local
continual training strategy to address this problem. Importance weights are
evaluated on a small proxy dataset on the central server and then used to
constrain the local training. With this additional term, we alleviate weight
divergence and continually integrate the knowledge of different local clients
into the global model, which ensures better generalization ability.
Experiments on various FL settings demonstrate that our method significantly
improves the initial performance of federated models at little extra
communication cost.
Comment: This paper has been accepted to the 2020 IEEE International Conference on Image Processing (ICIP 2020).
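To make the constrained local objective concrete, here is a minimal PyTorch sketch of what importance-weighted local training could look like. It is an illustration, not the authors' implementation: the averaged squared-gradient (diagonal-Fisher-style) importance estimate, the function names, and the coefficient `lam` are all assumptions.

```python
import torch

def importance_weights(global_model, proxy_loader, loss_fn):
    """Estimate per-parameter importance on the server's small proxy
    dataset, here via averaged squared gradients (a diagonal-Fisher-style
    approximation; the paper's exact estimator may differ)."""
    importance = [torch.zeros_like(p) for p in global_model.parameters()]
    for x, y in proxy_loader:
        loss = loss_fn(global_model(x), y)
        grads = torch.autograd.grad(loss, list(global_model.parameters()))
        for imp, g in zip(importance, grads):
            imp += g.detach() ** 2
    return [imp / len(proxy_loader) for imp in importance]

def local_loss(local_model, global_params, importance, task_loss, lam=0.1):
    """Client-side objective: the task loss plus an importance-weighted
    term that penalizes drifting away from the global model on
    'important' weights, alleviating weight divergence."""
    penalty = sum((imp * (p - g) ** 2).sum()
                  for p, g, imp in zip(local_model.parameters(),
                                       global_params, importance))
    return task_loss + lam * penalty
```

The server would compute the importance weights once per round and ship them alongside the global model, which matches the abstract's claim of few extra communication costs.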
Elastically-Constrained Meta-Learner for Federated Learning
Federated learning is an approach in which multiple parties collaboratively
train machine learning models without sharing their data. One of the
challenges in federated learning is non-IID data across clients, as a single
model cannot fit the data distributions of all clients. Meta-learning
methods, such as Per-FedAvg, have been introduced to cope with this
challenge. Meta-learning learns shared initial parameters for all clients.
Each client then employs gradient descent to quickly adapt this
initialization to its local data distribution, realizing model
personalization. However, due to the non-convex loss function and the
randomness of sampled updates, meta-learning approaches produce unstable
adaptation goals for the same client. This fluctuation across adaptation
directions hinders convergence in meta-learning. To overcome this challenge,
we propose an elastically-constrained method that uses the historically
adapted local model to restrict the direction of the inner loop. As a result,
the inner loop in the current round preserves historical goals while adapting
toward better solutions.
Experiments show that our method accelerates meta-learning convergence and
improves personalization without additional computation or communication. Our
method achieves state-of-the-art results on all metrics across three public
datasets.
Comment: FL-IJCAI'2
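A rough PyTorch sketch of the inner loop this describes, under stated assumptions: the elastic pull toward the previous round's adapted parameters is written as a simple squared penalty with a hypothetical strength `mu`; the paper's exact formulation and schedule may differ.

```python
import copy
import torch

def elastic_inner_loop(model, meta_params, hist_adapted, loader, loss_fn,
                       lr=0.01, mu=0.1, steps=5):
    """One client's inner-loop adaptation: start from the shared
    meta-initialization, then adapt locally while being elastically
    pulled toward the previous round's adapted parameters, keeping the
    adaptation direction stable across rounds."""
    model.load_state_dict(meta_params)
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    for _ in range(steps):
        for x, y in loader:
            # hist_adapted: detached tensors saved from the last round
            elastic = sum(((p - h) ** 2).sum()
                          for p, h in zip(model.parameters(), hist_adapted))
            loss = loss_fn(model(x), y) + mu * elastic
            opt.zero_grad()
            loss.backward()
            opt.step()
    return copy.deepcopy(model.state_dict())  # next round's history
```

In the first round, where no history exists yet, the penalty would simply be dropped (mu = 0).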
Jointly Exploring Client Drift and Catastrophic Forgetting in Dynamic Learning
Federated and Continual Learning have emerged as potential paradigms for the
robust and privacy-aware use of Deep Learning in dynamic environments. However,
Client Drift and Catastrophic Forgetting are fundamental obstacles to
guaranteeing consistent performance. Existing work addresses these problems
only separately, neglecting the fact that the root causes behind both forms
of performance deterioration are connected. We propose a unified analysis
framework for building a controlled test environment for Client Drift -- by
perturbing a defined ratio of clients -- and Catastrophic Forgetting -- by
shifting all clients with a particular strength. Our framework further
leverages this new combined analysis by generating a 3D landscape of the
combined performance impact from both. We demonstrate that the performance drop
through Client Drift, caused by a certain share of shifted clients, is
correlated with the drop from Catastrophic Forgetting resulting from a
corresponding shift strength. Correlation tests between both problems for
Computer Vision (CelebA) and Medical Imaging (PESO) support this new
perspective, with an average Pearson rank correlation coefficient of over 0.94.
Our framework's novel ability to perform combined spatio-temporal shift analysis allows
us to investigate how both forms of distribution shift behave in mixed
scenarios, opening a new pathway for better generalization. We show that a
combination of moderate Client Drift and Catastrophic Forgetting can even
improve the performance of the resulting model (causing a "Generalization
Bump") compared to when only one of the shifts occurs individually. We apply a
simple and commonly used method from Continual Learning in the federated
setting and observe this phenomenon to be reoccurring, leveraging the ability
of our framework to analyze existing and novel methods for Federated and
Continual Learning
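The controlled test environment lends itself to a compact sketch. Below is a hypothetical NumPy illustration of the two shift axes; the Gaussian perturbations and the names `drift_ratio` and `forget_strength` are stand-ins for whatever shift operators the paper actually uses.

```python
import numpy as np

def apply_shifts(client_datasets, drift_ratio=0.3, forget_strength=0.0, seed=0):
    """Build a controlled test environment in the spirit of the framework:
    perturb a defined ratio of clients (Client Drift axis) and shift all
    clients with a common strength (Catastrophic Forgetting axis)."""
    rng = np.random.default_rng(seed)
    n = len(client_datasets)
    drifted = set(rng.choice(n, size=int(drift_ratio * n), replace=False).tolist())
    shifted = []
    for i, (x, y) in enumerate(client_datasets):
        # temporal shift applied to every client
        x = x + forget_strength * rng.standard_normal(x.shape)
        if i in drifted:
            # spatial shift applied only to the perturbed subset
            x = x + rng.standard_normal(x.shape)
        shifted.append((x, y))
    return shifted
```

Sweeping `drift_ratio` and `forget_strength` over a grid and recording the resulting model performance at each point would produce the 3D landscape the abstract describes.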
SphereFed: Hyperspherical Federated Learning
Federated Learning aims at training a global model from multiple
decentralized devices (i.e. clients) without exchanging their private local
data. A key challenge is the handling of non-i.i.d. (independent and
identically distributed) data across multiple clients, which may induce
disparities in their local features. We introduce the Hyperspherical Federated Learning (SphereFed)
framework to address the non-i.i.d. issue by constraining learned
representations of data points to be on a unit hypersphere shared by clients.
Specifically, all clients learn their local representations by minimizing the
loss with respect to a fixed classifier whose weights span the unit
hypersphere. After federated training of the global model, this classifier is
further calibrated with a closed-form solution obtained by minimizing a mean
squared loss. We show that the calibration solution can be computed
efficiently and in a distributed manner without direct access to local data. Extensive
experiments indicate that our SphereFed approach is able to improve the
accuracy of multiple existing federated learning algorithms by a considerable
margin (up to 6% on challenging datasets) with enhanced computation and
communication efficiency across datasets and model architectures.
Comment: European Conference on Computer Vision 2022.
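A minimal sketch of the two ingredients named in the abstract, assuming a ridge-regularized least-squares form for the calibration and random unit-norm class vectors (the paper's fixed classifier and closed-form solution may be constructed differently):

```python
import torch

def fixed_hypersphere_classifier(num_classes, dim, seed=0):
    """A frozen classifier whose class vectors lie on the unit
    hypersphere and are shared by all clients (random directions here;
    the paper spreads them more carefully)."""
    g = torch.Generator().manual_seed(seed)
    W = torch.randn(num_classes, dim, generator=g)
    return torch.nn.functional.normalize(W, dim=1)

def calibrate_classifier(features, targets, ridge=1e-3):
    """Closed-form calibration after federated training: solve the
    ridge-regularized least-squares problem min_W ||F @ W.T - Y||^2."""
    F, Y = features, targets  # F: (n, d) features, Y: (n, c) one-hot labels
    A = F.T @ F + ridge * torch.eye(F.shape[1])
    return torch.linalg.solve(A, F.T @ Y).T  # (c, d) calibrated weights
```

Because the solution depends on local data only through the aggregates F.T @ F and F.T @ Y, clients could ship these statistics instead of raw features, which is what makes a distributed computation without direct data access plausible.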
When Foundation Model Meets Federated Learning: Motivations, Challenges, and Future Directions
The intersection of the Foundation Model (FM) and Federated Learning (FL)
provides mutual benefits, presents a unique opportunity to unlock new
possibilities in AI research, and addresses critical challenges in AI and
real-world applications. FL expands the availability of data for FMs and
enables computation sharing, distributing the training process and reducing the
burden on FL participants. It promotes collaborative FM development,
democratizing the process and fostering inclusivity and innovation. On the
other hand, FM, with its enormous size, pre-trained knowledge, and exceptional
performance, serves as a robust starting point for FL, facilitating faster
convergence and better performance under non-iid data. Additionally, leveraging
FM to generate synthetic data enriches data diversity, reduces overfitting, and
preserves privacy. By examining the interplay between FL and FM, this paper
aims to deepen the understanding of their synergistic relationship,
highlighting the motivations, challenges, and future directions. Through an
exploration of the challenges faced by FL and FM individually and their
interconnections, we aim to inspire future research directions that can further
enhance both fields, driving advancements and propelling the development of
privacy-preserving and scalable AI systems.
TinyReptile: TinyML with Federated Meta-Learning
Tiny machine learning (TinyML) is a rapidly growing field aiming to
democratize machine learning (ML) for resource-constrained microcontrollers
(MCUs). Given the pervasiveness of these tiny devices, it is natural to ask
whether TinyML applications can benefit from aggregating their knowledge.
Federated learning (FL) enables decentralized agents to jointly learn a global
model without sharing sensitive local data. However, a common global model may
not work for all devices due to the complexity of the actual deployment
environment and the heterogeneity of the data available on each device. In
addition, TinyML hardware imposes significant computational and communication
constraints, which traditional ML methods fail to address. Considering
these challenges, we propose TinyReptile, a simple but efficient algorithm
inspired by meta-learning and online learning, to collaboratively learn a
solid initialization for a neural network (NN) across tiny devices, one that
can be quickly adapted to a new device's data. We demonstrate
TinyReptile on a Raspberry Pi 4 and a Cortex-M4 MCU with only 256 KB of RAM.
Evaluations on various TinyML use cases confirm at least a two-fold reduction
in resource consumption and training time compared with baseline algorithms
of comparable performance.
Comment: Accepted by the International Joint Conference on Neural Networks (IJCNN) 2023.
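For intuition, here is what one serial, Reptile-style round could look like. This sketch uses PyTorch for readability, whereas an actual TinyML deployment would run on an MCU runtime in C; all hyperparameters and names are illustrative, not the authors' implementation.

```python
import torch

def tinyreptile_round(global_params, client_loader, build_model, loss_fn,
                      inner_lr=0.01, outer_lr=0.5, inner_steps=5):
    """One serial visit to a single tiny device in a Reptile-style
    scheme: adapt the shared initialization locally with plain SGD,
    then move the initialization toward the adapted weights. Assumes
    all state entries are float parameters (no integer buffers)."""
    model = build_model()
    model.load_state_dict(global_params)
    opt = torch.optim.SGD(model.parameters(), lr=inner_lr)
    for _ in range(inner_steps):
        for x, y in client_loader:
            loss = loss_fn(model(x), y)
            opt.zero_grad()
            loss.backward()
            opt.step()
    adapted = model.state_dict()
    # Reptile outer update: theta <- theta + eps * (theta_adapted - theta)
    return {k: v + outer_lr * (adapted[k] - v) for k, v in global_params.items()}
```

Visiting devices one at a time keeps the peak memory and communication per round small, which is the property that matters on resource-constrained MCUs.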