
    Asynchronous Federated Learning via Over-the-Air Computation

    The emerging field of federated learning (FL) provides great potential for edge intelligence while protecting data privacy. However, as the system grows in scale or becomes more heterogeneous, new challenges arise, such as spectrum shortage and stragglers. These issues can potentially be addressed by over-the-air computation (AirComp) and asynchronous FL, respectively; however, combining them is difficult because of their conflicting requirements. In this paper, we propose a novel asynchronous FL scheme with AirComp that operates in a time-triggered manner (async-AirFed). Conventional asynchronous aggregation requires historical data to be used for model updates, which can cause channel noise and interference to accumulate when AirComp is applied. To address this issue, we propose a simple but effective truncation method that retains a limited length of historical data. Convergence analysis shows that the proposed async-AirFed converges on a non-convex optimality function at a sub-linear rate. Simulation results show that the proposed scheme converges more than 34% faster than the benchmarks in reaching an accuracy of 85%, which also improves the time utilization efficiency and reduces the impact of staleness and the channel

    Channel-Driven Monte Carlo Sampling for Bayesian Distributed Learning in Wireless Data Centers

    Conventional frequentist learning, as assumed by existing federated learning protocols, is limited in its ability to quantify uncertainty, incorporate prior knowledge, guide active learning, and enable continual learning. Bayesian learning provides a principled approach to address all these limitations, at the cost of an increase in computational complexity. This paper studies distributed Bayesian learning in a wireless data center setting encompassing a central server and multiple distributed workers. Prior work on wireless distributed learning has focused exclusively on frequentist learning, and has introduced the idea of leveraging uncoded transmission to enable "over-the-air" computing. Unlike frequentist learning, Bayesian learning aims at evaluating approximations of, or samples from, a global posterior distribution in the model parameter space. This work investigates for the first time the design of distributed one-shot, or "embarrassingly parallel", Bayesian learning protocols in wireless data centers via consensus Monte Carlo (CMC). Uncoded transmission is introduced not only as a way to implement "over-the-air" computing, but also as a mechanism to deploy channel-driven MC sampling: rather than treating channel noise as a nuisance to be mitigated, channel-driven sampling utilizes channel noise as an integral part of the MC sampling process. A simple wireless CMC scheme is first proposed that is asymptotically optimal under Gaussian local posteriors. Then, for arbitrary local posteriors, a variational optimization strategy is introduced. Simulation results demonstrate that, if properly accounted for, channel noise can indeed contribute to MC sampling and does not necessarily decrease the accuracy level. Comment: Under Revision

    Over-The-Air Federated Learning Over Scalable Cell-free Massive MIMO

    Cell-free massive MIMO is emerging as a promising technology for future wireless communication systems, which is expected to offer uniform coverage and high spectral efficiency compared to classical cellular systems. We study in this paper how cell-free massive MIMO can support federated edge learning. Taking advantage of the additive nature of the wireless multiple access channel, over-the-air computation is exploited, where the clients send their local updates simultaneously over the same communication resource. This approach, known as over-the-air federated learning (OTA-FL), is proven to alleviate the communication overhead of federated learning over wireless networks. Considering channel correlation and only imperfect channel state information available at the central server, we propose a practical implementation of OTA-FL over cell-free massive MIMO. The convergence of the proposed implementation is studied analytically and experimentally, confirming the benefits of cell-free massive MIMO for OTA-FL
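The additive-channel idea in the abstract above can be illustrated with a minimal sketch. It is not the paper's scheme: it assumes a single-antenna receiver, perfect channel knowledge at every client, and plain channel inversion, whereas the paper considers cell-free massive MIMO with imperfect CSI. The function name `ota_aggregate` and its parameters are hypothetical.

```python
import random

def ota_aggregate(updates, channels, noise_std=0.0, rng=None):
    """Average client updates through a simulated additive multiple access channel.

    Assumptions (illustrative only): one receive antenna, flat fading,
    and perfect channel knowledge at each client.
    """
    rng = rng or random.Random(0)
    num_clients = len(updates)
    dim = len(updates[0])
    received = [0j] * dim
    for update, h in zip(updates, channels):
        # Each client pre-scales its update by 1/h (channel inversion).
        transmitted = [x / h for x in update]
        # The channel applies h, and all clients' signals superimpose
        # because they share the same communication resource.
        for i, s in enumerate(transmitted):
            received[i] += h * s
    # Additive receiver noise, drawn independently per real/imaginary part.
    received = [
        r + complex(rng.gauss(0.0, noise_std), rng.gauss(0.0, noise_std))
        for r in received
    ]
    # The server only sees the sum; dividing by the client count gives the average.
    return [r.real / num_clients for r in received]

updates = [[1.0, 2.0], [3.0, 4.0]]
channels = [0.5 + 0.5j, 2.0 - 1.0j]
average = ota_aggregate(updates, channels)
```

With `noise_std=0.0` the result matches the ordinary average of the updates up to floating-point rounding. A positive `noise_std` shows how receiver noise perturbs the aggregate.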

    How Robust is Federated Learning to Communication Error? A Comparison Study Between Uplink and Downlink Channels

    Because of its privacy-preserving capability, federated learning (FL) has attracted significant attention from both academia and industry. However, when FL is implemented over wireless networks, it is not clear how much communication error it can tolerate. This paper investigates the robustness of FL to uplink and downlink communication errors. Our theoretical analysis reveals that the robustness depends on two critical parameters, namely the number of clients and the numerical range of model parameters. It is also shown that the uplink communication in FL can tolerate a higher bit error rate (BER) than downlink communication, and this difference is quantified by a proposed formula. The findings and theoretical analyses are further validated by extensive experiments. Comment: Submitted to IEEE for possible publication

    Imperfect CSI: A Key Factor of Uncertainty to Over-the-Air Federated Learning

    Over-the-air computation (AirComp) has recently been identified as a prominent technique to enhance communication efficiency of wireless federated learning (FL). This letter investigates the impact of channel state information (CSI) uncertainty at the transmitter on an AirComp enabled FL (AirFL) system with the truncated channel inversion strategy. To characterize the performance of the AirFL system, the weight divergence with respect to the ideal aggregation is analytically derived to evaluate learning performance loss. We explicitly reveal that the weight divergence deteriorates as $\mathcal{O}(1/\rho^2)$ as the level of channel estimation accuracy $\rho$ vanishes, and also has a decay rate of $\mathcal{O}(1/K^2)$ with the increasing number of participating devices, $K$. Building upon our analytical results, we formulate the channel truncation threshold optimization problem to adapt to different $\rho$, which can be solved optimally. Numerical results verify the analytical results and show that a lower truncation threshold is preferred with more accurate CSI. Comment: Submitted to IEEE for possible publication