Is Normalization Indispensable for Multi-domain Federated Learning?
Federated learning (FL) enhances data privacy with collaborative in-situ
training on decentralized clients. Nevertheless, FL encounters challenges due
to non-independent and identically distributed (non-i.i.d.) data, leading to
potential performance degradation and hindered convergence. While prior studies
predominantly focused on skewed label distributions, our research
addresses a crucial yet frequently overlooked problem known as multi-domain FL.
In this scenario, clients' data originate from diverse domains with distinct
feature distributions, as opposed to label distributions. To address the
multi-domain problem in FL, we propose a novel method called Federated learning
Without normalizations (FedWon). FedWon draws inspiration from the observation
that batch normalization (BN) faces challenges in effectively modeling the
statistics of multiple domains, while alternative normalization techniques
possess their own limitations. In order to address these issues, FedWon
eliminates all normalizations in FL and reparameterizes convolution layers with
scaled weight standardization. Through comprehensive experimentation on four
datasets and four models, our results demonstrate that FedWon surpasses both
FedAvg and the current state-of-the-art method (FedBN) across all experimental
setups, achieving notable improvements of over 10% in certain domains.
Furthermore, FedWon is versatile for both cross-silo and cross-device FL,
exhibiting strong performance even with a batch size as small as 1, thereby
catering to resource-constrained devices. Additionally, FedWon effectively
tackles the challenge of skewed label distributions.
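As a concrete illustration, the sketch below shows a PyTorch-style convolution with scaled weight standardization, the normalization-free reparameterization the abstract attributes to FedWon; the class name, gain parameter, and epsilon are illustrative assumptions, not code from the paper.

```python
# Minimal sketch: a Conv2d whose filters are standardized and rescaled,
# removing the need for batch normalization (names are illustrative).
import torch
import torch.nn as nn
import torch.nn.functional as F

class WSConv2d(nn.Conv2d):
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        # Learnable per-channel gain, one scalar per output filter.
        self.gain = nn.Parameter(torch.ones(self.out_channels, 1, 1, 1))

    def forward(self, x):
        # Standardize each filter to zero mean and unit variance,
        # scaled by fan-in so activation magnitudes stay stable.
        w = self.weight
        fan_in = w[0].numel()
        mean = w.mean(dim=(1, 2, 3), keepdim=True)
        var = w.var(dim=(1, 2, 3), keepdim=True)
        w = (w - mean) * torch.rsqrt(var * fan_in + 1e-4)
        return F.conv2d(x, self.gain * w, self.bias, self.stride,
                        self.padding, self.dilation, self.groups)
```

Because no batch statistics are involved, such a layer behaves identically at batch size 1, which is consistent with the cross-device claim above.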
When Foundation Model Meets Federated Learning: Motivations, Challenges, and Future Directions
The intersection of the Foundation Model (FM) and Federated Learning (FL)
provides mutual benefits: it presents a unique opportunity to unlock new
possibilities in AI research and to address critical challenges in AI and
real-world applications. FL expands the availability of data for FMs and
enables computation sharing, distributing the training process and reducing the
burden on FL participants. It promotes collaborative FM development,
democratizing the process and fostering inclusivity and innovation. On the
other hand, FM, with its enormous size, pre-trained knowledge, and exceptional
performance, serves as a robust starting point for FL, facilitating faster
convergence and better performance under non-iid data. Additionally, leveraging
FM to generate synthetic data enriches data diversity, reduces overfitting, and
preserves privacy. By examining the interplay between FL and FM, this paper
aims to deepen the understanding of their synergistic relationship,
highlighting the motivations, challenges, and future directions. Through an
exploration of the challenges faced by FL and FM individually and their
interconnections, we aim to inspire future research directions that can further
enhance both fields, driving advancements and propelling the development of
privacy-preserving and scalable AI systems.
Towards Fundamentally Scalable Model Selection: Asymptotically Fast Update and Selection
The advancement of deep learning technologies is bringing new models every
day, motivating the study of scalable model selection. An ideal model selection
scheme should minimally support two operations efficiently over a large pool of
candidate models: update, which involves either adding a new candidate model or
removing an existing candidate model, and selection, which involves locating
highly performing models for a given task. However, previous solutions to model
selection require high computational complexity for at least one of these two
operations. In this work, we target fundamentally (more) scalable model
selection that supports asymptotically fast update and asymptotically fast
selection at the same time. Firstly, we define isolated model embedding, a
family of model selection schemes supporting asymptotically fast update and
selection: With respect to the number of candidate models m, the update
complexity is O(1) and the selection consists of a single sweep over m
vectors in addition to O(1) model operations. Isolated model embedding also
implies several desirable properties for applications. Secondly, we present
Standardized Embedder, an empirical realization of isolated model embedding. We
assess its effectiveness by using it to select representations from a pool of
100 pre-trained vision models for classification tasks and measuring the
performance gaps between the selected models and the best candidates with a
linear probing protocol. Experiments suggest our realization is effective in
selecting models with competitive performances and highlight isolated model
embedding as a promising direction towards model selection that is
fundamentally (more) scalable.
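To make the interface concrete, here is a small self-contained sketch of the isolated-model-embedding contract: each model is embedded independently of the rest of the pool, so update is O(1) and selection is one sweep over the stored vectors. The embed_model and embed_task callables are placeholders for whatever encoders a realization such as Standardized Embedder would supply.

```python
# Sketch of an isolated-model-embedding index (encoder helpers assumed).
import numpy as np

class IsolatedModelIndex:
    def __init__(self, embed_model, embed_task):
        self.embed_model = embed_model  # model -> fixed-size vector
        self.embed_task = embed_task    # task data -> vector in same space
        self.vectors = {}               # model_id -> embedding

    def add(self, model_id, model):
        # O(1) in pool size: touches only this model.
        self.vectors[model_id] = self.embed_model(model)

    def remove(self, model_id):
        # Also O(1): no other embedding is recomputed.
        self.vectors.pop(model_id, None)

    def select(self, task_data, k=5):
        # A single sweep over the stored vectors plus O(1) model operations.
        t = self.embed_task(task_data)
        t = t / (np.linalg.norm(t) + 1e-12)
        scores = {mid: float(v @ t / (np.linalg.norm(v) + 1e-12))
                  for mid, v in self.vectors.items()}
        return sorted(scores, key=scores.get, reverse=True)[:k]
```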
TARGET: Federated Class-Continual Learning via Exemplar-Free Distillation
This paper focuses on an under-explored yet important problem: Federated
Class-Continual Learning (FCCL), where new classes are dynamically added in
federated learning. Existing FCCL works suffer from various limitations, such
as requiring additional datasets or storing the private data from previous
tasks. In response, we first demonstrate that non-IID data exacerbates the
catastrophic forgetting issue in FL. Then we propose a novel method called
TARGET (federaTed clAss-continual leaRninG via Exemplar-free disTillation),
which alleviates
catastrophic forgetting in FCCL while preserving client data privacy. Our
proposed method leverages the previously trained global model to transfer
knowledge of old tasks to the current task at the model level. Moreover, a
generator is trained to produce synthetic data to simulate the global
distribution of data on each client at the data level. Compared to previous
FCCL methods, TARGET does not require any additional datasets or storing real
data from previous tasks, which makes it ideal for data-sensitive scenarios.
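The model-level transfer above can be sketched as a distillation loss in which the frozen previous global model teaches the current model on generator-produced synthetic data, so no real data from old tasks is stored. The temperature, batch size, and latent_dim attribute are illustrative assumptions, not values from the paper.

```python
# Hedged sketch of exemplar-free distillation on synthetic data.
import torch
import torch.nn.functional as F

def distill_on_synthetic(current_model, old_global_model, generator,
                         batch_size=64, temperature=2.0):
    old_global_model.eval()
    z = torch.randn(batch_size, generator.latent_dim)  # latent_dim assumed
    synthetic = generator(z)
    with torch.no_grad():
        teacher_logits = old_global_model(synthetic)
    student_logits = current_model(synthetic)
    # Soften both distributions and match them, transferring old-task
    # knowledge without storing any real examples.
    return F.kl_div(
        F.log_softmax(student_logits / temperature, dim=1),
        F.softmax(teacher_logits / temperature, dim=1),
        reduction="batchmean",
    ) * temperature ** 2
```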
MAS: Towards Resource-Efficient Federated Multiple-Task Learning
Federated learning (FL) is an emerging distributed machine learning method
that empowers in-situ model training on decentralized edge devices. However,
multiple simultaneous FL tasks could overload resource-constrained devices. In
this work, we propose the first FL system to effectively coordinate and train
multiple simultaneous FL tasks. We first formalize the problem of training
simultaneous FL tasks. Then, we present our new approach, MAS (Merge and
Split), to optimize the performance of training multiple simultaneous FL tasks.
MAS starts by merging FL tasks into an all-in-one FL task with a multi-task
architecture. After training for a few rounds, MAS splits the all-in-one FL
task into two or more FL tasks by using the affinities among tasks measured
during the all-in-one training. It then continues training each split of FL
tasks based on model parameters from the all-in-one training. Extensive
experiments demonstrate that MAS outperforms other methods while reducing
training time by 2x and reducing energy consumption by 40%. We hope this work
will inspire the community to further study and optimize training simultaneous
FL tasks.
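To illustrate the split step, the toy sketch below greedily groups tasks from a pairwise affinity matrix; both the greedy rule and the affinity values are invented for illustration and are not MAS's actual criterion or measurements.

```python
# Toy sketch: split an all-in-one task set by measured pairwise affinity.
import numpy as np

def split_by_affinity(task_names, affinity, threshold=0.5):
    """Assign each task to the first group whose members all have
    affinity >= threshold with it; otherwise start a new group."""
    groups = []
    for i, name in enumerate(task_names):
        for group in groups:
            if all(affinity[i][j] >= threshold for _, j in group):
                group.append((name, i))
                break
        else:
            groups.append([(name, i)])
    return [[name for name, _ in g] for g in groups]

tasks = ["segmentation", "depth", "normals", "classification"]
affinity = np.array([
    [1.0, 0.8, 0.7, 0.2],   # made-up numbers for illustration
    [0.8, 1.0, 0.9, 0.1],
    [0.7, 0.9, 1.0, 0.3],
    [0.2, 0.1, 0.3, 1.0],
])
print(split_by_affinity(tasks, affinity))
# -> [['segmentation', 'depth', 'normals'], ['classification']]
```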
EasyFL: A Low-code Federated Learning Platform For Dummies
Academia and industry have developed several platforms to support the popular
privacy-preserving distributed learning method -- Federated Learning (FL).
However, these platforms are complex to use and require a deep understanding of
FL, which imposes high barriers to entry for beginners, limits the productivity
of researchers, and compromises deployment efficiency. In this paper, we
propose the first low-code FL platform, EasyFL, to enable users with various
levels of expertise to experiment and prototype FL applications with little
coding. We achieve this goal while ensuring great flexibility and extensibility
for customization by unifying simple API design, modular design, and granular
training flow abstraction. With only a few lines of code, EasyFL provides users
with many out-of-the-box functionalities to accelerate experimentation and
deployment. These practical functionalities are heterogeneity simulation,
comprehensive tracking, distributed training optimization, and seamless
deployment. They are designed based on challenges identified in the proposed FL
life cycle. Compared with other platforms, EasyFL not only requires just three
lines of code (at least 10x fewer) to build a vanilla FL application but also
incurs lower training overhead. Besides, our evaluations demonstrate that
EasyFL expedites distributed training by 1.5x. It also improves the efficiency
of deployment. We believe that EasyFL will increase the productivity of
researchers and democratize FL to wider audiences.
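The "three lines of code" claim suggests usage roughly like the sketch below; the function names reflect our reading of the EasyFL project and should be treated as assumptions rather than a verified API reference.

```python
import easyfl        # assumed package name

easyfl.init()        # assumed: load a default FL configuration
easyfl.run()         # assumed: start simulated federated training
```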
HuggingGPT: Solving AI Tasks with ChatGPT and its Friends in Hugging Face
Solving complicated AI tasks with different domains and modalities is a key
step toward artificial general intelligence. While there are numerous AI models
available for various domains and modalities, they cannot handle complicated AI
tasks autonomously. Considering large language models (LLMs) have exhibited
exceptional abilities in language understanding, generation, interaction, and
reasoning, we advocate that LLMs could act as a controller to manage existing
AI models to solve complicated AI tasks, with language serving as a generic
interface. Based on this philosophy, we present HuggingGPT, an
LLM-powered agent that leverages LLMs (e.g., ChatGPT) to connect various AI
models in machine learning communities (e.g., Hugging Face) to solve AI tasks.
Specifically, we use ChatGPT to conduct task planning when receiving a user
request, select models according to their function descriptions available in
Hugging Face, execute each subtask with the selected AI model, and summarize
the response according to the execution results. By leveraging the strong
language capability of ChatGPT and abundant AI models in Hugging Face,
HuggingGPT can tackle a wide range of sophisticated AI tasks spanning different
modalities and domains and achieve impressive results in language, vision,
speech, and other challenging tasks, which paves a new way towards the
realization of artificial general intelligence.
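The four-stage workflow described above can be summarized schematically; llm, model_catalog, and run_model are caller-supplied placeholders, and the prompts are illustrative rather than HuggingGPT's actual prompts.

```python
# Schematic sketch of the plan -> select -> execute -> summarize loop.
def hugging_gpt(request, llm, model_catalog, run_model):
    # 1. Task planning: the LLM decomposes the request into subtasks.
    subtasks = llm(f"Decompose into subtasks, one per line: {request}").splitlines()
    results = []
    for task in subtasks:
        # 2. Model selection: pick a model by its function description.
        menu = "\n".join(f"{m}: {d}" for m, d in model_catalog.items())
        model = llm(f"Pick the best model name for '{task}' from:\n{menu}").strip()
        # 3. Task execution: run the selected expert model on the subtask.
        results.append((task, run_model(model, task)))
    # 4. Response generation: the LLM summarizes the execution results.
    report = "\n".join(f"{t} -> {r}" for t, r in results)
    return llm(f"Summarize these results for the user:\n{report}")
```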