FLASH: Heterogeneity-Aware Federated Learning at Scale
Federated learning (FL) has become a promising machine learning paradigm, yet the impact of heterogeneous hardware specifications and dynamic device states on the FL process has not been studied systematically. This paper presents the first large-scale study of this impact, based on real-world data collected from 136k smartphones. We conducted extensive experiments on our proposed heterogeneity-aware FL platform, named FLASH, to systematically explore the performance of state-of-the-art FL algorithms and key FL configurations in heterogeneity-aware and -unaware settings, finding the following. (1) Heterogeneity causes accuracy to drop by up to 9.2% and convergence time to increase by 2.32×. (2) Heterogeneity negatively impacts popular aggregation algorithms; e.g., the accuracy-variance reduction brought by q-FedAvg drops by 17.5%. (3) Heterogeneity does not significantly worsen the accuracy loss caused by gradient-compression algorithms, but it increases the convergence time by up to 2.5×. (4) Heterogeneity hinders client-selection algorithms from selecting the wanted clients, thus reducing their effectiveness; e.g., the accuracy increase brought by the state-of-the-art client-selection algorithm drops by 73.9%. (5) Heterogeneity causes the optimal FL hyper-parameters to drift significantly; more specifically, the heterogeneity-unaware setting favors a looser deadline and a higher reporting fraction to achieve better training performance. (6) Heterogeneity results in a non-trivial fraction of failed clients (more than 10%) and leads to participation bias (the top 30% of clients contribute 86% of the computation). Our FLASH platform and data have been publicly open-sourced.
Resource and Heterogeneity-aware Clients Eligibility Protocol in Federated Learning
Federated Learning (FL) is a new paradigm of Machine Learning (ML) that enables on-device computation via decentralized data training. However, traditional FL algorithms impose strict requirements on client selection and its ratio. Moreover, data training becomes inefficient when clients' computational resources are limited. Toward this goal, we aim to extend FL, a decentralized learning framework, so that it works efficiently with heterogeneous clients in practical industrial scenarios. To this end, we propose a Clients' Eligibility Protocol (CEP), a resource-aware FL solution for heterogeneous environments. CEP places a Trusted Authority (TA) between the clients and the cloud server; the TA calculates each client's eligibility score based on local computing resources such as bandwidth, memory, and battery life, and selects the most resourceful clients for training. If a client responds slowly or submits an incorrect model, the TA declares the client ineligible for future training. Besides, the proposed CEP leverages an asynchronous FL model, which avoids long delays in clients' responses. The empirical results prove that the proposed CEP gains the benefits of resource-aware client selection and achieves 88% and 93% accuracy on AlexNet and LeNet, respectively.
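The eligibility scoring the protocol describes can be sketched as follows. This is a minimal illustration, not the paper's implementation: the capacity ceilings, the weights, and the helper names (`eligibility_score`, `select_clients`) are all assumptions.

```python
def eligibility_score(bandwidth_mbps, memory_gb, battery_pct,
                      caps=(100.0, 8.0, 100.0), weights=(0.4, 0.3, 0.3)):
    """Combine normalized resource readings into a single score in [0, 1]."""
    vals = (bandwidth_mbps, memory_gb, battery_pct)
    # Normalize each resource against an assumed capacity ceiling, clipped to 1.0.
    norm = [min(v / c, 1.0) for v, c in zip(vals, caps)]
    # Weighted sum of the normalized resources (weights are illustrative).
    return sum(w * x for w, x in zip(weights, norm))

def select_clients(clients, k):
    """clients: dict cid -> (bandwidth_mbps, memory_gb, battery_pct).
    Return the k most resourceful client ids."""
    ranked = sorted(clients, key=lambda c: eligibility_score(*clients[c]),
                    reverse=True)
    return ranked[:k]
```

In a real deployment the TA would also track response times and model-integrity checks to revoke eligibility, as the abstract describes.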
FedCompass: Efficient Cross-Silo Federated Learning on Heterogeneous Client Devices using a Computing Power Aware Scheduler
Cross-silo federated learning offers a promising solution to collaboratively
train robust and generalized AI models without compromising the privacy of
local datasets, e.g., healthcare, financial, as well as scientific projects
that lack a centralized data facility. Nonetheless, because of the disparity of
computing resources among different clients (i.e., device heterogeneity),
synchronous federated learning algorithms suffer from degraded efficiency when
waiting for straggler clients. Similarly, asynchronous federated learning
algorithms experience degradation in the convergence rate and final model
accuracy on non-identically and independently distributed (non-IID)
heterogeneous datasets due to stale local models and client drift. To address
these limitations in cross-silo federated learning with heterogeneous clients
and data, we propose FedCompass, an innovative semi-asynchronous federated
learning algorithm with a computing power aware scheduler on the server side,
which adaptively assigns varying amounts of training tasks to different clients
using the knowledge of the computing power of individual clients. FedCompass
ensures that multiple locally trained models from clients are received almost
simultaneously as a group for aggregation, effectively reducing the staleness
of local models. At the same time, the overall training process remains
asynchronous, eliminating prolonged waiting periods from straggler clients.
Using diverse non-IID heterogeneous distributed datasets, we demonstrate that
FedCompass achieves faster convergence and higher accuracy than other
asynchronous algorithms while remaining more efficient than synchronous
algorithms when performing federated learning on heterogeneous clients.
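The core scheduling idea, giving faster clients proportionally more local training steps so that a group of clients reports back almost simultaneously, can be sketched as below. This is a simplified illustration, not the actual FedCompass scheduler; the function name and the steps-per-second speed model are assumptions.

```python
def assign_local_steps(speeds, arrival_window):
    """speeds: dict cid -> local training steps per second.
    arrival_window: seconds until the group should report back.
    Faster clients receive proportionally more local steps, so every
    client finishes at roughly the same wall-clock time."""
    return {cid: max(1, int(s * arrival_window)) for cid, s in speeds.items()}
```

Because each client's workload matches its speed, model staleness at aggregation time stays low without forcing fast clients to idle.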
Knowledge-Aware Federated Active Learning with Non-IID Data
Federated learning enables multiple decentralized clients to learn
collaboratively without sharing the local training data. However, the expensive
annotation cost to acquire data labels on local clients remains an obstacle in
utilizing local data. In this paper, we propose a federated active learning
paradigm to efficiently learn a global model with limited annotation budget
while protecting data privacy in a decentralized learning way. The main
challenge faced by federated active learning is the mismatch between the active
sampling goal of the global model on the server and that of the asynchronous
local clients. This becomes even more significant when data is distributed
non-IID across local clients. To address the aforementioned challenge, we
propose Knowledge-Aware Federated Active Learning (KAFAL), which consists of
Knowledge-Specialized Active Sampling (KSAS) and Knowledge-Compensatory
Federated Update (KCFU). KSAS is a novel active sampling method tailored for
the federated active learning problem. It deals with the mismatch challenge by
sampling actively based on the discrepancies between local and global models.
KSAS intensifies specialized knowledge in local clients, ensuring the sampled
data to be informative for both the local clients and the global model. KCFU,
in the meantime, deals with the client heterogeneity caused by limited data and
non-IID data distributions. It compensates for each client's ability in weak
classes by the assistance of the global model. Extensive experiments and
analyses are conducted to show the superiority of KSAS over the
state-of-the-art active learning methods and the efficiency of KCFU under the
federated active learning framework. (Comment: 14 pages, 12 figures)
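The discrepancy-based sampling idea behind KSAS can be illustrated with a toy sketch: score each unlabeled sample by the divergence between the local and global models' predictions, and spend the annotation budget on the highest-scoring ones. The use of per-sample KL divergence and the function names here are assumptions, not the paper's exact formulation.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def discrepancy_scores(local_logits, global_logits):
    """Per-sample KL(local || global) over class probabilities: large when
    the local specialist and the global model disagree on a sample."""
    p, q = softmax(local_logits), softmax(global_logits)
    return np.sum(p * (np.log(p + 1e-12) - np.log(q + 1e-12)), axis=1)

def select_for_labeling(local_logits, global_logits, budget):
    """Return indices of the `budget` most-discrepant unlabeled samples."""
    scores = discrepancy_scores(local_logits, global_logits)
    return np.argsort(-scores)[:budget]
```

Samples where the two models agree contribute little new information to either side, so the budget goes where local knowledge and global knowledge diverge.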
Dynamic Federated Learning for Heterogeneous Learning Environments
The emergence of the Internet of Things (IoT) has resulted in a massive influx of data generated by various edge devices. Machine learning models trained on this data can provide valuable insights and predictions, leading to better decision-making and intelligent applications. Federated Learning (FL) is a distributed learning paradigm that enables remote devices to collaboratively train models without sharing sensitive data, thus preserving user privacy and reducing communication overhead. However, despite recent breakthroughs in FL, heterogeneous learning environments significantly limit its performance and hinder its real-world applications. The heterogeneous learning environment is mainly embodied in two aspects. First, statistically heterogeneous (usually non-independent and identically distributed, i.e., non-IID) data from geographically distributed clients can deteriorate FL training accuracy. Second, the heterogeneous computing and communication resources of IoT devices often result in unstable training processes that slow down the training of a global model and affect energy consumption. Most existing studies address only one side of the heterogeneity issue, either statistical or resource heterogeneity. However, the resource heterogeneity among various devices does not necessarily correlate with the distribution of their training data. We propose Dynamic Federated Learning (DFL) to address the joint problem of data and resource heterogeneity in FL. DFL combines resource-aware split computing of deep neural networks and dynamic clustering of training participants based on the similarity of their sub-model layers. Using resource-aware split learning, the allocation of FL training tasks on resource-constrained participants is adjusted to match their heterogeneous computing capabilities, while resource-capable participants carry out classic FL training.
We employ centered kernel alignment (CKA) for determining the similarity of neural network layers to address the data heterogeneity, and carry out layerwise sub-model aggregation. Preliminary results indicate that the proposed technique can improve training performance (i.e., training time, accuracy, and energy consumption) in heterogeneous learning environments with both data and resource heterogeneity.
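The layer-similarity measure mentioned above, centered kernel alignment, can be sketched in its standard linear form as follows; whether the paper uses the linear or a kernelized variant is not stated here.

```python
import numpy as np

def linear_cka(X, Y):
    """Linear CKA similarity between two activation matrices of shape
    (n_samples, n_features); returns a value in [0, 1]."""
    # Center each feature column.
    X = X - X.mean(axis=0)
    Y = Y - Y.mean(axis=0)
    # ||X^T Y||_F^2 / (||X^T X||_F * ||Y^T Y||_F)
    hsic = np.linalg.norm(X.T @ Y, 'fro') ** 2
    return hsic / (np.linalg.norm(X.T @ X, 'fro') *
                   np.linalg.norm(Y.T @ Y, 'fro'))
```

CKA is invariant to orthogonal transformations and isotropic scaling of the activations, which is why it is a popular choice for comparing layers trained on different clients' data.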
RHFedMTL: Resource-Aware Hierarchical Federated Multi-Task Learning
The rapid development of artificial intelligence (AI) over massive
applications including Internet-of-things on cellular network raises the
concern of technical challenges such as privacy, heterogeneity and resource
efficiency.
Federated learning is an effective way to enable AI over massive distributed
nodes with security.
However, conventional works mostly focus on learning a single global model
for a unique task across the network, and are generally less competent to
handle multi-task learning (MTL) scenarios with stragglers at acceptable
computation and communication cost. Meanwhile, it is challenging to ensure
privacy while maintaining coupled multi-task learning across multiple
base stations (BSs) and terminals. In this paper, inspired by the natural
cloud-BS-terminal hierarchy of cellular networks, we provide a viable
resource-aware hierarchical federated MTL (RHFedMTL) solution to meet the
heterogeneity of tasks, by solving different tasks within the BSs and
aggregating the multi-task result in the cloud without compromising the
privacy. Specifically, a primal-dual method has been leveraged to effectively
transform the coupled MTL into some local optimization sub-problems within BSs.
Furthermore, compared with existing methods that reduce resource cost by
simply changing the aggregation frequency,
we dive into the intricate relationship between resource consumption and
learning accuracy, and develop a resource-aware learning strategy for local
terminals and BSs to meet the resource budget. Extensive simulation results
demonstrate the effectiveness and superiority of RHFedMTL in terms of improving
the learning accuracy and boosting the convergence rate. (Comment: 11 pages, 8 figures)
Federated Edge Learning : Design Issues and Challenges
Federated Learning (FL) is a distributed machine learning technique, where
each device contributes to the learning model by independently computing the
gradient based on its local training data. It has recently become a hot
research topic, as it promises several benefits related to data privacy and
scalability. However, implementing FL at the network edge is challenging due to
system and data heterogeneity and resource constraints. In this article, we
examine the existing challenges and trade-offs in Federated Edge Learning
(FEEL). The design of FEEL algorithms for resource-efficient learning raises
several challenges. These challenges are essentially related to the
multidisciplinary nature of the problem. As the data is the key component of
the learning, this article advocates a new set of considerations for data
characteristics in wireless scheduling algorithms in FEEL. Hence, we propose a
general framework for the data-aware scheduling as a guideline for future
research directions. We also discuss the main axes and requirements for data
evaluation and some exploitable techniques and metrics. (Comment: Submitted to IEEE Network Magazine)
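One toy instance of the data-aware scheduling the article advocates is a priority score mixing wireless channel quality with a data-importance measure. Everything below (the names, the linear mixing, the `alpha` trade-off) is an illustrative assumption, not the framework the article proposes.

```python
def data_aware_priority(channel_gain, data_importance, alpha=0.5):
    """Toy scheduling priority: alpha trades off channel quality (cheap,
    reliable uploads) against a data-importance measure (e.g., local loss)."""
    return alpha * channel_gain + (1 - alpha) * data_importance

def schedule(clients, k, alpha=0.5):
    """clients: dict cid -> (channel_gain, data_importance), both in [0, 1].
    Return the k highest-priority client ids for this round."""
    ranked = sorted(clients,
                    key=lambda c: data_aware_priority(*clients[c], alpha),
                    reverse=True)
    return ranked[:k]
```

Purely channel-aware scheduling would always favor well-connected clients; mixing in a data term lets informative but poorly-connected clients still participate.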
Federated Embedded Systems – a review of the literature in related fields
This report is concerned with the vision of smart interconnected objects, a vision that has attracted much attention lately. In this paper, embedded, interconnected, open, and heterogeneous control systems are in focus, formally referred to as Federated Embedded Systems (FES). To place FES into context, a review of some related research directions is presented. This review includes such concepts as systems of systems, cyber-physical systems, ubiquitous computing, internet of things, and multi-agent systems. Interestingly, the reviewed fields seem to overlap with each other in an increasing number of ways.
FedDisco: Federated Learning with Discrepancy-Aware Collaboration
This work considers the category distribution heterogeneity in federated
learning. This issue is due to biased labeling preferences at multiple clients
and is a typical setting of data heterogeneity. To alleviate this issue, most
previous works consider either regularizing local models or fine-tuning the
global model, while they ignore the adjustment of aggregation weights and
simply assign weights based on the dataset size. However, based on our
empirical observations and theoretical analysis, we find that the dataset size
is not optimal and the discrepancy between local and global category
distributions could be a beneficial and complementary indicator for determining
aggregation weights. We thus propose a novel aggregation method, Federated
Learning with Discrepancy-aware Collaboration (FedDisco), whose aggregation
weights not only involve both the dataset size and the discrepancy value, but
also contribute to a tighter theoretical upper bound of the optimization error.
FedDisco also promotes privacy-preservation, communication and computation
efficiency, as well as modularity. Extensive experiments show that our FedDisco
outperforms several state-of-the-art methods and can be easily incorporated
with many existing methods to further enhance the performance. Our code will be
available at https://github.com/MediaBrain-SJTU/FedDisco. (Comment: Accepted by the International Conference on Machine Learning, ICML 2023)
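The weighting idea, combining dataset size with a category-distribution discrepancy, can be sketched roughly as below. The exact form of FedDisco's weights and its tuned hyper-parameters are in the paper; the `a` and `b` values and the L2 discrepancy measure here are illustrative assumptions.

```python
import numpy as np

def disco_weights(sizes, local_dists, global_dist, a=0.5, b=0.1):
    """Aggregation weights from relative dataset size minus a penalty on
    the discrepancy between local and global category distributions."""
    n = np.asarray(sizes, dtype=float)
    n = n / n.sum()  # relative dataset sizes
    # Discrepancy d_k: L2 distance between local and global label distributions.
    d = np.array([np.linalg.norm(np.asarray(ld) - np.asarray(global_dist))
                  for ld in local_dists])
    # Down-weight clients with skewed label distributions; clip at zero.
    w = np.maximum(n - a * d + b, 0.0)
    return w / w.sum()
```

Compared with size-only weighting (plain FedAvg), a client whose labels closely match the global distribution receives a larger share of the aggregate even at equal dataset size.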
The Gradient Convergence Bound of Federated Multi-Agent Reinforcement Learning with Efficient Communication
The paper considers a distributed version of deep reinforcement learning
(DRL) for multi-agent decision-making process in the paradigm of federated
learning. Since the deep neural network models in federated learning are
trained locally and aggregated iteratively through a central server, frequent
information exchange incurs a large amount of communication overheads. Besides,
due to the heterogeneity of agents, Markov state transition trajectories from
different agents are usually unsynchronized within the same time interval,
which will further influence the convergence bound of the aggregated deep
neural network models. Therefore, it is of vital importance to reasonably
evaluate the effectiveness of different optimization methods. Accordingly, this
paper proposes a utility function to consider the balance between reducing
communication overheads and improving convergence performance. Meanwhile, this
paper develops two new optimization methods on top of variation-aware periodic
averaging methods: 1) the decay-based method which gradually decreases the
weight of the model's local gradients within the progress of local updating,
and 2) the consensus-based method which introduces the consensus algorithm into
federated learning for the exchange of the model's local gradients. This paper
also provides novel convergence guarantees for both developed methods and
demonstrates their effectiveness and efficiency through theoretical analysis
and numerical simulation results.
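The decay-based method's core idea, gradually shrinking the contribution of later local gradients to limit drift between aggregations, can be sketched on a scalar model as below. The geometric `decay ** t` schedule and the function name are illustrative assumptions, not the paper's exact scheme.

```python
def decayed_local_sgd(w_global, grad_fn, steps=5, lr=0.1, decay=0.8):
    """Run local SGD from the global model, scaling the gradient at local
    step t by decay**t so later (more drift-prone) updates count less.
    decay=1.0 recovers plain local SGD."""
    w = float(w_global)
    for t in range(steps):
        w -= lr * (decay ** t) * grad_fn(w)
    return w
```

On a simple quadratic objective, the decayed run moves toward the minimizer but stops short of where undecayed local SGD would land, which is exactly the drift reduction the method targets.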