
    FLASH: Heterogeneity-Aware Federated Learning at Scale

    Federated learning (FL) has become a promising machine learning paradigm, yet the impact of heterogeneous hardware specifications and dynamic device states on the FL process has not been studied systematically. This paper presents the first large-scale study of this impact, based on real-world data collected from 136k smartphones. We conducted extensive experiments on our proposed heterogeneity-aware FL platform, FLASH, to systematically explore the performance of state-of-the-art FL algorithms and key FL configurations in heterogeneity-aware and -unaware settings, and found the following. (1) Heterogeneity causes accuracy to drop by up to 9.2% and convergence time to increase by 2.32×. (2) Heterogeneity negatively impacts popular aggregation algorithms; e.g., the accuracy-variance reduction brought by q-FedAvg drops by 17.5%. (3) Heterogeneity does not significantly worsen the accuracy loss caused by gradient-compression algorithms, but it increases their convergence time by up to 2.5×. (4) Heterogeneity hinders client-selection algorithms from selecting the intended clients, reducing their effectiveness; e.g., the accuracy gain brought by the state-of-the-art client-selection algorithm drops by 73.9%. (5) Heterogeneity causes the optimal FL hyper-parameters to drift significantly; in particular, the heterogeneity-unaware setting favors a looser deadline and a higher reporting fraction to achieve better training performance. (6) Heterogeneity results in a non-trivial fraction of failed clients (more than 10%) and leads to participation bias (the top 30% of clients contribute 86% of the computation). Our FLASH platform and data have been publicly open-sourced.
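
    Findings (5) and (6) hinge on two configuration knobs, the round deadline and the reporting fraction. As a minimal, hypothetical sketch (not part of FLASH), the following simulates how heterogeneous client completion times interact with these knobs in a synchronous round; the client-time distribution and helper names are assumptions.

        # Minimal sketch (not FLASH itself): how the round deadline and the
        # reporting fraction interact in one synchronous FL round. Client
        # completion times are drawn from an assumed distribution.
        import random

        def run_round(client_times, deadline, reporting_fraction):
            """Return the clients whose updates the server accepts this round.

            client_times       -- per-client seconds to train and upload
            deadline           -- seconds the server waits before aggregating
            reporting_fraction -- fraction of selected clients that must
                                  report for the round to succeed
            """
            reported = [c for c, t in client_times.items() if t <= deadline]
            required = int(len(client_times) * reporting_fraction)
            if len(reported) < required:
                return []  # failed round: too few clients beat the deadline
            return reported

        # Heterogeneous hardware: completion times spread over a wide range.
        times = {c: random.uniform(5, 120) for c in range(100)}
        accepted = run_round(times, deadline=60, reporting_fraction=0.8)
        print(f"{len(accepted)} clients accepted; "
              f"round {'succeeded' if accepted else 'failed'}")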

    Resource and Heterogeneity-aware Clients Eligibility Protocol in Federated Learning

    Federated Learning (FL) is a new paradigm of Machine Learning (ML) that enables on-device computation via decentralized data training. However, traditional FL algorithms impose strict requirements on client selection and its ratio, and data training becomes inefficient when clients' computational resources are limited. We therefore aim to extend FL, a decentralized learning framework, to work efficiently with heterogeneous clients in practical industrial scenarios. To this end, we propose a Clients' Eligibility Protocol (CEP), a resource-aware FL solution for heterogeneous environments. We place a Trusted Authority (TA) between the clients and the cloud server; the TA calculates each client's eligibility score based on local computing resources such as bandwidth, memory, and battery life, and selects the most resourceful clients for training. If a client responds slowly or submits an incorrect model, the TA declares that client ineligible for future training. Moreover, the proposed CEP leverages an asynchronous FL model, which avoids long delays in clients' responses. The empirical results prove that the proposed CEP gains the benefits of resource-aware client selection and achieves 88% and 93% accuracy on AlexNet and LeNet, respectively.
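
    A minimal sketch of the eligibility idea described above (not the paper's exact protocol); the scoring weights, normalization constants, and class names below are assumptions for illustration:

        # Sketch of a TA that scores clients by local resources, selects the
        # most resourceful ones, and blacklists misbehaving clients.
        from dataclasses import dataclass, field

        @dataclass
        class TrustedAuthority:
            blacklist: set = field(default_factory=set)

            def eligibility_score(self, bandwidth_mbps, memory_gb, battery_pct):
                # Weighted sum of normalized resources; weights are assumed.
                return (0.4 * min(bandwidth_mbps / 100, 1.0)
                        + 0.3 * min(memory_gb / 8, 1.0)
                        + 0.3 * battery_pct / 100)

            def select(self, clients, k):
                """Pick the k most resourceful clients not yet blacklisted."""
                eligible = {cid: self.eligibility_score(*res)
                            for cid, res in clients.items()
                            if cid not in self.blacklist}
                return sorted(eligible, key=eligible.get, reverse=True)[:k]

            def report_misbehavior(self, cid):
                # Slow responses or incorrect models make a client ineligible.
                self.blacklist.add(cid)

        ta = TrustedAuthority()
        clients = {"a": (50, 4, 90), "b": (10, 2, 30), "c": (80, 8, 70)}
        print(ta.select(clients, k=2))  # -> ['c', 'a']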

    FedCompass: Efficient Cross-Silo Federated Learning on Heterogeneous Client Devices using a Computing Power Aware Scheduler

    Cross-silo federated learning offers a promising solution for collaboratively training robust and generalized AI models without compromising the privacy of local datasets, e.g., in healthcare, finance, and scientific projects that lack a centralized data facility. Nonetheless, because of the disparity of computing resources among different clients (i.e., device heterogeneity), synchronous federated learning algorithms suffer from degraded efficiency when waiting for straggler clients. Similarly, asynchronous federated learning algorithms experience degradation in convergence rate and final model accuracy on non-identically and independently distributed (non-IID) heterogeneous datasets due to stale local models and client drift. To address these limitations in cross-silo federated learning with heterogeneous clients and data, we propose FedCompass, an innovative semi-asynchronous federated learning algorithm with a computing-power-aware scheduler on the server side, which adaptively assigns varying amounts of training tasks to different clients using knowledge of the computing power of individual clients. FedCompass ensures that multiple locally trained models from clients are received almost simultaneously as a group for aggregation, effectively reducing the staleness of local models. At the same time, the overall training process remains asynchronous, eliminating prolonged waiting periods caused by straggler clients. Using diverse non-IID heterogeneous distributed datasets, we demonstrate that FedCompass achieves faster convergence and higher accuracy than other asynchronous algorithms while remaining more efficient than synchronous algorithms when performing federated learning on heterogeneous clients.
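
    The core scheduling idea lends itself to a short sketch. The following is a minimal illustration (not the full FedCompass algorithm), assuming the server can measure each client's seconds per local step and picks a target group duration:

        # Assign each client a number of local steps inversely proportional
        # to its measured time per step, so a group of clients finishes at
        # roughly the same wall-clock time. Target duration is an assumption.

        def assign_local_steps(sec_per_step, target_duration):
            """sec_per_step: {client_id: measured seconds per local step}."""
            return {cid: max(1, round(target_duration / s))
                    for cid, s in sec_per_step.items()}

        speeds = {"fast": 0.2, "medium": 0.5, "slow": 2.0}
        steps = assign_local_steps(speeds, target_duration=60)
        # fast -> 300 steps, medium -> 120, slow -> 30: all ~60 s of work,
        # so their updates arrive almost simultaneously for group aggregation.
        print(steps)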

    Knowledge-Aware Federated Active Learning with Non-IID Data

    Federated learning enables multiple decentralized clients to learn collaboratively without sharing their local training data. However, the expensive annotation cost of acquiring data labels on local clients remains an obstacle to utilizing local data. In this paper, we propose a federated active learning paradigm to efficiently learn a global model with a limited annotation budget while protecting data privacy in a decentralized learning way. The main challenge faced by federated active learning is the mismatch between the active sampling goal of the global model on the server and that of the asynchronous local clients. This becomes even more significant when data is distributed non-IID across local clients. To address this challenge, we propose Knowledge-Aware Federated Active Learning (KAFAL), which consists of Knowledge-Specialized Active Sampling (KSAS) and Knowledge-Compensatory Federated Update (KCFU). KSAS is a novel active sampling method tailored for the federated active learning problem. It deals with the mismatch challenge by sampling actively based on the discrepancies between local and global models. KSAS intensifies specialized knowledge in local clients, ensuring that the sampled data are informative for both the local clients and the global model. KCFU, in the meantime, deals with the client heterogeneity caused by limited data and non-IID data distributions. It compensates for each client's ability in weak classes with the assistance of the global model. Extensive experiments and analyses are conducted to show the superiority of KSAS over state-of-the-art active learning methods and the efficiency of KCFU under the federated active learning framework.
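
    A minimal sketch of discrepancy-based active sampling in the spirit of KSAS (not the paper's exact criterion), assuming disagreement is measured by the KL divergence between local and global softmax outputs:

        # Unlabeled samples on which the local and global models disagree
        # most are sent for annotation, up to the labeling budget.
        import numpy as np

        def kl(p, q, eps=1e-12):
            return np.sum(p * np.log((p + eps) / (q + eps)), axis=-1)

        def select_for_labeling(local_probs, global_probs, budget):
            """Both inputs: (num_unlabeled, num_classes) softmax outputs."""
            discrepancy = kl(local_probs, global_probs)
            return np.argsort(discrepancy)[::-1][:budget]

        rng = np.random.default_rng(0)
        logits_l, logits_g = rng.normal(size=(2, 50, 10))
        p_l = np.exp(logits_l) / np.exp(logits_l).sum(-1, keepdims=True)
        p_g = np.exp(logits_g) / np.exp(logits_g).sum(-1, keepdims=True)
        print(select_for_labeling(p_l, p_g, budget=5))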

    Dynamic Federated Learning for Heterogeneous Learning Environments

    The emergence of the Internet of Things (IoT) has resulted in a massive influx of data generated by various edge devices. Machine learning models trained on this data can provide valuable insights and predictions, leading to better decision-making and intelligent applications. Federated Learning (FL) is a distributed learning paradigm that enables remote devices to collaboratively train models without sharing sensitive data, thus preserving user privacy and reducing communication overhead. However, despite recent breakthroughs in FL, heterogeneous learning environments significantly limit its performance and hinder its real-world applications. This heterogeneity is mainly embodied in two aspects. First, statistically heterogeneous (usually non-independent and identically distributed) data from geographically distributed clients can deteriorate FL training accuracy. Second, heterogeneous computing and communication resources in IoT devices often result in unstable training processes that slow down the training of a global model and affect energy consumption. Most existing studies address only one side of the heterogeneity issue, either the statistical or the resource heterogeneity, yet the resource heterogeneity among various devices does not necessarily correlate with the distribution of their training data. We propose Dynamic Federated Learning (DFL) to address the joint problem of data and resource heterogeneity in FL. DFL combines resource-aware split computing of deep neural networks and dynamic clustering of training participants based on the similarity of their sub-model layers. Using resource-aware split learning, the allocation of FL training tasks on resource-constrained participants is adjusted to match their heterogeneous computing capabilities, while resource-capable participants carry out classic FL training. We employ centered kernel alignment to determine the similarity of neural network layers, addressing the data heterogeneity, and carry out layerwise sub-model aggregation. Preliminary results indicate that the proposed technique can improve training performance (i.e., training time, accuracy, and energy consumption) in learning environments with both data and resource heterogeneity.
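
    The layer-similarity step can be made concrete. Below is a minimal sketch of linear centered kernel alignment (CKA) between two clients' activations for one layer, the quantity used here to cluster participants; the feature shapes are assumptions:

        # Linear CKA: ||Y^T X||_F^2 / (||X^T X||_F * ||Y^T Y||_F) on
        # column-centered activations. Values near 1 mean similar layers.
        import numpy as np

        def linear_cka(X, Y):
            """X, Y: (num_samples, features) activations of one layer."""
            X = X - X.mean(0)          # center each feature
            Y = Y - Y.mean(0)
            hsic = np.linalg.norm(Y.T @ X, "fro") ** 2
            norm_x = np.linalg.norm(X.T @ X, "fro")
            norm_y = np.linalg.norm(Y.T @ Y, "fro")
            return hsic / (norm_x * norm_y)

        rng = np.random.default_rng(0)
        a = rng.normal(size=(128, 64))
        print(linear_cka(a, a))                           # ~1.0: identical layers
        print(linear_cka(a, rng.normal(size=(128, 64))))  # low: unrelated layers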

    RHFedMTL: Resource-Aware Hierarchical Federated Multi-Task Learning

    The rapid development of artificial intelligence (AI) across massive applications, including the Internet of Things on cellular networks, raises concerns about technical challenges such as privacy, heterogeneity, and resource efficiency. Federated learning is an effective way to enable AI over massive distributed nodes with security. However, conventional works mostly focus on learning a single global model for a unique task across the network and are generally less competent at handling multi-task learning (MTL) scenarios with stragglers at acceptable computation and communication cost. Meanwhile, it is challenging to ensure privacy while maintaining coupled multi-task learning across multiple base stations (BSs) and terminals. In this paper, inspired by the natural cloud-BS-terminal hierarchy of cellular networks, we provide a viable resource-aware hierarchical federated MTL (RHFedMTL) solution to meet the heterogeneity of tasks, by solving different tasks within the BSs and aggregating the multi-task result in the cloud without compromising privacy. Specifically, a primal-dual method is leveraged to effectively transform the coupled MTL problem into local optimization sub-problems within BSs. Furthermore, rather than simply reducing resource cost by changing the aggregation frequency as in existing methods, we dive into the intricate relationship between resource consumption and learning accuracy and develop a resource-aware learning strategy for local terminals and BSs to meet the resource budget. Extensive simulation results demonstrate the effectiveness and superiority of RHFedMTL in terms of improving learning accuracy and boosting the convergence rate.
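
    The primal-dual decomposition can be illustrated with a small sketch. The following is an ADMM-style consensus example (not the paper's exact formulation), assuming a simple quadratic objective per BS and a cloud that averages the BS models and performs dual ascent:

        # A coupling constraint (all task models agreeing with a shared
        # reference z) is relaxed with dual variables, so each BS solves
        # only a local sub-problem; the cloud aggregates and updates duals.
        import numpy as np

        def local_subproblem(target, z, lam, rho=1.0):
            # Primal step for f_i(w) = 0.5*||w - target||^2 plus the
            # augmented term lam^T (w - z) + (rho/2)||w - z||^2 (closed form).
            return (target - lam + rho * z) / (1 + rho)

        targets = [np.array([1.0, 0.0]), np.array([0.0, 1.0])]  # per-BS tasks
        lams = [np.zeros(2) for _ in targets]
        z = np.zeros(2)
        for _ in range(50):
            ws = [local_subproblem(t, z, l) for t, l in zip(targets, lams)]
            z = np.mean(ws, axis=0)                               # cloud step
            lams = [l + 1.0 * (w - z) for l, w in zip(lams, ws)]  # dual ascent
        print(z)  # converges to the consensus minimizer, [0.5, 0.5]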

    Federated Edge Learning : Design Issues and Challenges

    Federated Learning (FL) is a distributed machine learning technique in which each device contributes to the learning model by independently computing the gradient based on its local training data. It has recently become a hot research topic, as it promises several benefits related to data privacy and scalability. However, implementing FL at the network edge is challenging due to system and data heterogeneity and resource constraints. In this article, we examine the existing challenges and trade-offs in Federated Edge Learning (FEEL). The design of FEEL algorithms for resource-efficient learning raises several challenges, which are essentially related to the multidisciplinary nature of the problem. As data is the key component of learning, this article advocates a new set of considerations for data characteristics in wireless scheduling algorithms in FEEL. Hence, we propose a general framework for data-aware scheduling as a guideline for future research directions. We also discuss the main axes and requirements for data evaluation, along with some exploitable techniques and metrics.
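
    As a minimal sketch of the data-aware scheduling advocated above (not a prescribed algorithm), the following combines a channel-quality term with a data-importance proxy; the score form and its weighting are assumptions:

        # Rank clients by a combined score of wireless channel quality and
        # an assumed data-importance measure, then schedule the top k.

        def schedule(clients, k, alpha=0.5):
            """clients: {id: (channel_gain, data_importance)}, both in [0, 1].
            Picks the k clients with the best combined score."""
            score = {cid: alpha * ch + (1 - alpha) * imp
                     for cid, (ch, imp) in clients.items()}
            return sorted(score, key=score.get, reverse=True)[:k]

        clients = {"a": (0.9, 0.2), "b": (0.4, 0.9), "c": (0.7, 0.7)}
        print(schedule(clients, k=2))  # -> ['c', 'b'] with alpha=0.5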

    Federated Embedded Systems – a review of the literature in related fields

    This report is concerned with the vision of smart interconnected objects, a vision that has attracted much attention lately. The focus here is on embedded, interconnected, open, and heterogeneous control systems, formally referred to as Federated Embedded Systems (FES). To place FES into context, a review of related research directions is presented, covering such concepts as systems of systems, cyber-physical systems, ubiquitous computing, the Internet of Things, and multi-agent systems. Interestingly, the reviewed fields seem to overlap with each other in an increasing number of ways.

    FedDisco: Federated Learning with Discrepancy-Aware Collaboration

    This work considers category distribution heterogeneity in federated learning, an issue that arises from biased labeling preferences at multiple clients and is a typical setting of data heterogeneity. To alleviate this issue, most previous works consider either regularizing local models or fine-tuning the global model, while ignoring the adjustment of aggregation weights and simply assigning weights based on dataset size. However, based on our empirical observations and theoretical analysis, we find that the dataset size is not optimal and that the discrepancy between local and global category distributions can be a beneficial and complementary indicator for determining aggregation weights. We thus propose a novel aggregation method, Federated Learning with Discrepancy-aware Collaboration (FedDisco), whose aggregation weights not only involve both the dataset size and the discrepancy value but also contribute to a tighter theoretical upper bound on the optimization error. FedDisco also promotes privacy preservation, communication and computation efficiency, and modularity. Extensive experiments show that FedDisco outperforms several state-of-the-art methods and can be easily incorporated into many existing methods to further enhance performance. Our code will be available at https://github.com/MediaBrain-SJTU/FedDisco.
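
    The aggregation rule can be sketched compactly. Below is a minimal, hedged illustration of discrepancy-aware weights in the spirit of FedDisco; the exact functional form and the constants a and b below are assumptions, not the paper's tuned values:

        # Aggregation weights combine relative dataset size with a penalty
        # on the distance between local and global category distributions.
        import numpy as np

        def disco_weights(sizes, local_dists, global_dist, a=0.5, b=0.1):
            """sizes: dataset sizes; dists: per-category label distributions."""
            sizes = np.asarray(sizes, dtype=float)
            sizes /= sizes.sum()
            # Discrepancy: L2 distance between local and global label dists.
            d = np.array([np.linalg.norm(p - global_dist) for p in local_dists])
            w = np.maximum(sizes - a * d + b, 0.0)  # penalize skewed clients
            return w / w.sum()

        global_dist = np.full(4, 0.25)
        locals_ = [np.array([0.25, 0.25, 0.25, 0.25]),  # balanced client
                   np.array([0.85, 0.05, 0.05, 0.05])]  # heavily skewed client
        print(disco_weights([100, 100], locals_, global_dist))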

    The Gradient Convergence Bound of Federated Multi-Agent Reinforcement Learning with Efficient Communication

    This paper considers a distributed version of deep reinforcement learning (DRL) for multi-agent decision-making in the federated learning paradigm. Since the deep neural network models in federated learning are trained locally and aggregated iteratively through a central server, frequent information exchange incurs large communication overheads. Besides, due to the heterogeneity of agents, Markov state-transition trajectories from different agents are usually unsynchronized within the same time interval, which further influences the convergence bound of the aggregated deep neural network models. It is therefore of vital importance to reasonably evaluate the effectiveness of different optimization methods. Accordingly, this paper proposes a utility function that balances reducing communication overheads against improving convergence performance. Meanwhile, this paper develops two new optimization methods on top of variation-aware periodic averaging: 1) a decay-based method, which gradually decreases the weight of the model's local gradients as local updating progresses, and 2) a consensus-based method, which introduces a consensus algorithm into federated learning for the exchange of the model's local gradients. This paper also provides novel convergence guarantees for both methods and demonstrates their effectiveness and efficiency through theoretical analysis and numerical simulation results.
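
    The decay-based method admits a short sketch. Below is a minimal illustration (not the paper's exact update rule) in which the weight on the local gradient decays as local steps accumulate between aggregations; the decay schedule and toy objectives are assumptions:

        # Within each averaging period, the local gradient's weight decays
        # with the local step count, limiting drift between aggregations.
        import numpy as np

        def local_updates(w, grad_fn, steps, lr=0.1, decay=0.8):
            for t in range(steps):
                w = w - lr * (decay ** t) * grad_fn(w)  # decaying local weight
            return w

        # Two agents with different local objectives (simple quadratics).
        grads = [lambda w: w - 1.0, lambda w: w + 1.0]
        w_global = 0.0
        for _ in range(10):                      # periodic averaging rounds
            locals_ = [local_updates(w_global, g, steps=5) for g in grads]
            w_global = float(np.mean(locals_))   # server aggregation
        print(w_global)  # stays near the consensus optimum, 0.0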