Is Normalization Indispensable for Multi-domain Federated Learning?
Federated learning (FL) enhances data privacy with collaborative in-situ
training on decentralized clients. Nevertheless, FL encounters challenges due
to non-independent and identically distributed (non-i.i.d.) data, which can
degrade performance and hinder convergence. While prior studies have
predominantly addressed skewed label distributions, our work targets a crucial
yet frequently overlooked problem known as multi-domain FL.
In this scenario, clients' data originate from diverse domains with distinct
feature distributions, as opposed to label distributions. To address the
multi-domain problem in FL, we propose a novel method called Federated learning
Without normalizations (FedWon). FedWon draws inspiration from the observation
that batch normalization (BN) faces challenges in effectively modeling the
statistics of multiple domains, while alternative normalization techniques
possess their own limitations. To address these issues, FedWon
eliminates all normalizations in FL and reparameterizes convolution layers with
scaled weight standardization. Through comprehensive experimentation on four
datasets and four models, our results demonstrate that FedWon surpasses both
FedAvg and the current state-of-the-art method (FedBN) across all experimental
setups, achieving notable improvements of over 10% in certain domains.
Furthermore, FedWon is versatile for both cross-silo and cross-device FL,
exhibiting strong performance even with a batch size as small as 1, thereby
catering to resource-constrained devices. Additionally, FedWon effectively
tackles the challenge of skewed label distributions.
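Below is a minimal sketch of the kind of normalization-free building block the abstract refers to: a convolution reparameterized with scaled weight standardization, in the style popularized by NF-Nets, which can stand in for Conv2d + BatchNorm2d in a client model. The class name WSConv2d, the epsilon value, and the scaling details are illustrative assumptions in PyTorch, not the paper's reference implementation.

import torch
import torch.nn as nn
import torch.nn.functional as F


class WSConv2d(nn.Conv2d):
    """Conv2d with scaled weight standardization: weights are standardized per
    output channel and rescaled by a learnable gain, removing the need for
    batch normalization."""

    def __init__(self, in_channels, out_channels, kernel_size, **kwargs):
        super().__init__(in_channels, out_channels, kernel_size, **kwargs)
        # One learnable gain per output channel.
        self.gain = nn.Parameter(torch.ones(out_channels, 1, 1, 1))

    def forward(self, x):
        w = self.weight
        fan_in = w[0].numel()  # number of inputs feeding each output unit
        mean = w.mean(dim=(1, 2, 3), keepdim=True)
        var = w.var(dim=(1, 2, 3), keepdim=True)
        # Standardize per output channel, then scale so activations keep
        # roughly unit variance at initialization.
        w = self.gain * (w - mean) / torch.sqrt(var * fan_in + 1e-4)
        return F.conv2d(x, w, self.bias, self.stride,
                        self.padding, self.dilation, self.groups)


# Drop-in usage in place of Conv2d + BatchNorm2d; works even with batch size 1.
layer = WSConv2d(3, 16, kernel_size=3, padding=1)
out = layer(torch.randn(1, 3, 32, 32))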
When Foundation Model Meets Federated Learning: Motivations, Challenges, and Future Directions
The intersection of the Foundation Model (FM) and Federated Learning (FL)
provides mutual benefits, presents a unique opportunity to unlock new
possibilities in AI research, and addresses critical challenges in AI and
real-world applications. FL expands the availability of data for FMs and
enables computation sharing, distributing the training process and reducing the
burden on FL participants. It promotes collaborative FM development,
democratizing the process and fostering inclusivity and innovation. On the
other hand, FM, with its enormous size, pre-trained knowledge, and exceptional
performance, serves as a robust starting point for FL, facilitating faster
convergence and better performance under non-i.i.d. data. Additionally, leveraging
FM to generate synthetic data enriches data diversity, reduces overfitting, and
preserves privacy. By examining the interplay between FL and FM, this paper
aims to deepen the understanding of their synergistic relationship,
highlighting the motivations, challenges, and future directions. Through an
exploration of the challenges faced by FL and FM individually and their
interconnections, we aim to inspire future research directions that can further
enhance both fields, driving advancements and propelling the development of
privacy-preserving and scalable AI systems.
MAS: Towards Resource-Efficient Federated Multiple-Task Learning
Federated learning (FL) is an emerging distributed machine learning method
that empowers in-situ model training on decentralized edge devices. However,
multiple simultaneous FL tasks could overload resource-constrained devices. In
this work, we propose the first FL system to effectively coordinate and train
multiple simultaneous FL tasks. We first formalize the problem of training
simultaneous FL tasks. Then, we present our new approach, MAS (Merge and
Split), to optimize the performance of training multiple simultaneous FL tasks.
MAS starts by merging FL tasks into an all-in-one FL task with a multi-task
architecture. After training for a few rounds, MAS splits the all-in-one FL
task into two or more FL tasks by using the affinities among tasks measured
during the all-in-one training. It then continues training each split of FL
tasks based on model parameters from the all-in-one training. Extensive
experiments demonstrate that MAS outperforms other methods while reducing
training time by 2x and energy consumption by 40%. We hope this work
will inspire the community to further study and optimize training simultaneous
FL tasks.
Comment: ICCV'23. arXiv admin note: substantial text overlap with arXiv:2207.0420
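A hedged sketch of the splitting step described above: given pairwise task affinities measured during all-in-one training, pick the two-way partition of tasks with the highest total intra-group affinity. The brute-force search, the helper name best_two_way_split, and the example affinity values are illustrative assumptions rather than the paper's exact procedure.

from itertools import combinations


def best_two_way_split(tasks, affinity):
    """tasks: list of task names; affinity: dict[(a, b)] -> float, keyed by
    alphabetically sorted pairs. Returns the 2-way grouping with the highest
    total intra-group affinity."""

    def intra(group):
        return sum(affinity[tuple(sorted(p))] for p in combinations(group, 2))

    best_score, best_split = float("-inf"), None
    # Enumerate non-trivial 2-way partitions (fine for a handful of FL tasks).
    for k in range(1, len(tasks) // 2 + 1):
        for group in combinations(tasks, k):
            g1 = set(group)
            g2 = set(tasks) - g1
            score = intra(g1) + intra(g2)
            if score > best_score:
                best_score, best_split = score, (sorted(g1), sorted(g2))
    return best_split


# Example with made-up affinities between four hypothetical FL tasks.
tasks = ["seg", "depth", "normals", "edges"]
affinity = {("depth", "seg"): 0.1, ("normals", "seg"): 0.2,
            ("edges", "seg"): 0.9, ("depth", "normals"): 0.8,
            ("depth", "edges"): 0.1, ("edges", "normals"): 0.2}
print(best_two_way_split(tasks, affinity))  # (['edges', 'seg'], ['depth', 'normals'])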
Reducing Communication for Split Learning by Randomized Top-k Sparsification
Split learning is a simple solution for Vertical Federated Learning (VFL) and
has drawn substantial attention in both research and application due to
its simplicity and efficiency. However, communication efficiency is still a
crucial issue for split learning. In this paper, we investigate multiple
communication reduction methods for split learning, including cut layer size
reduction, top-k sparsification, quantization, and L1 regularization. Through
analysis of the cut layer size reduction and top-k sparsification, we further
propose randomized top-k sparsification to make the model generalize and
converge better: top-k elements are selected with high probability, while
non-top-k elements are selected with a small probability.
Empirical results show that compared with other communication-reduction
methods, our proposed randomized top-k sparsification achieves a better model
performance under the same compression level.
Comment: Accepted by IJCAI 202
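A minimal sketch of the randomized top-k idea described above: most of the k kept elements come from the top-k by magnitude, and a small remainder is drawn at random from the non-top-k elements. The p_random split and the helper name randomized_topk are illustrative assumptions; the paper's exact selection probabilities are not reproduced here.

import torch


def randomized_topk(x, k, p_random=0.1):
    """Sparsify a 1-D tensor x, keeping k elements: roughly (1 - p_random) * k
    from the top-k by magnitude, the rest drawn uniformly from the remaining
    indices."""
    k_top = max(1, int(round(k * (1 - p_random))))
    k_rand = k - k_top

    order = torch.argsort(x.abs(), descending=True)
    keep = [order[:k_top]]
    if k_rand > 0:
        rest = order[k_top:]
        pick = torch.randperm(rest.numel())[:k_rand]
        keep.append(rest[pick])
    keep = torch.cat(keep)

    mask = torch.zeros_like(x)
    mask[keep] = 1.0
    return x * mask  # only k entries are non-zero and need to be transmitted


# Example: sparsify cut-layer activations before sending them to the other party.
acts = torch.randn(512)
sparse = randomized_topk(acts, k=64)
print((sparse != 0).sum().item())  # 64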