171 research outputs found
Asynchronous Federated Learning via Over-the-Air Computation
The emerging field of federated learning (FL) offers great potential for edge intelligence while protecting data privacy. However, as the system grows in scale or becomes more heterogeneous, new challenges arise, such as spectrum shortage and stragglers. These issues can potentially be addressed by over-the-air computation (AirComp) and asynchronous FL, respectively; however, combining the two is difficult because of their conflicting requirements. In this paper, we propose a novel asynchronous FL scheme with AirComp that operates in a time-triggered manner (async-AirFed). Conventional asynchronous aggregation requires historical data to be used for model updates, which can cause channel noise and interference to accumulate when AirComp is applied. To address this issue, we propose a simple but effective truncation method that retains only a limited length of historical data. Convergence analysis shows that the proposed async-AirFed converges at a sub-linear rate on a non-convex optimality function. Simulation results show that the proposed scheme converges more than 34% faster than the benchmarks in reaching an accuracy of 85%, which also improves the time utilization efficiency and reduces the impact of staleness and the channel
Channel-Driven Monte Carlo Sampling for Bayesian Distributed Learning in Wireless Data Centers
Conventional frequentist learning, as assumed by existing federated learning
protocols, is limited in its ability to quantify uncertainty, incorporate prior
knowledge, guide active learning, and enable continual learning. Bayesian
learning provides a principled approach to address all these limitations, at
the cost of an increase in computational complexity. This paper studies
distributed Bayesian learning in a wireless data center setting encompassing a
central server and multiple distributed workers. Prior work on wireless
distributed learning has focused exclusively on frequentist learning, and has
introduced the idea of leveraging uncoded transmission to enable "over-the-air"
computing. Unlike frequentist learning, Bayesian learning aims at evaluating
approximations or samples from a global posterior distribution in the model
parameter space. This work investigates for the first time the design of
distributed one-shot, or "embarrassingly parallel", Bayesian learning protocols
in wireless data centers via consensus Monte Carlo (CMC). Uncoded transmission
is introduced not only as a way to implement "over-the-air" computing, but also
as a mechanism to deploy channel-driven MC sampling: Rather than treating
channel noise as a nuisance to be mitigated, channel-driven sampling utilizes
channel noise as an integral part of the MC sampling process. A simple wireless
CMC scheme is first proposed that is asymptotically optimal under Gaussian
local posteriors. Then, for arbitrary local posteriors, a variational
optimization strategy is introduced. Simulation results demonstrate that, if
properly accounted for, channel noise can indeed contribute to MC sampling and
does not necessarily decrease the accuracy level. Comment: Under Revision
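
As a rough illustration of the consensus Monte Carlo idea mentioned in the abstract above, the following minimal sketch combines one draw from each worker's Gaussian local posterior using precision weights. It assumes scalar parameters and known local variances, and it does not model the wireless channel or the channel-driven sampling mechanism; the function names `consensus_sample` and `global_posterior` and all numeric values are hypothetical.

```python
import random

def consensus_sample(local_means, local_vars, rng):
    """Draw one sample per worker and combine them with precision weights."""
    weights = [1.0 / v for v in local_vars]  # precision of each local posterior
    draws = [rng.gauss(m, v ** 0.5) for m, v in zip(local_means, local_vars)]
    return sum(w * d for w, d in zip(weights, draws)) / sum(weights)

def global_posterior(local_means, local_vars):
    """Mean and variance of the normalized product of the local Gaussians."""
    precision = sum(1.0 / v for v in local_vars)
    mean = sum(m / v for m, v in zip(local_means, local_vars)) / precision
    return mean, 1.0 / precision

rng = random.Random(0)              # fixed seed so the run is repeatable
means = [0.0, 2.0, 4.0]             # hypothetical local posterior means
variances = [1.0, 0.5, 2.0]         # hypothetical local posterior variances

samples = [consensus_sample(means, variances, rng) for _ in range(20000)]
sample_mean = sum(samples) / len(samples)
sample_var = sum((s - sample_mean) ** 2 for s in samples) / len(samples)
target_mean, target_var = global_posterior(means, variances)

print(f"empirical mean {sample_mean:.3f} vs target {target_mean:.3f}")
print(f"empirical var  {sample_var:.3f} vs target {target_var:.3f}")
```

With Gaussian local posteriors, the precision-weighted combination has the same mean and variance as the product of the local posteriors, which is why the empirical and target statistics should agree closely.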
Over-The-Air Federated Learning Over Scalable Cell-free Massive MIMO
Cell-free massive MIMO is emerging as a promising technology for future
wireless communication systems, which is expected to offer uniform coverage and
high spectral efficiency compared to classical cellular systems. We study in
this paper how cell-free massive MIMO can support federated edge learning.
Taking advantage of the additive nature of the wireless multiple access
channel, over-the-air computation is exploited, where the clients send their
local updates simultaneously over the same communication resource. This
approach, known as over-the-air federated learning (OTA-FL), is proven to
alleviate the communication overhead of federated learning over wireless
networks. Considering channel correlation and only imperfect channel state
information available at the central server, we propose a practical
implementation of OTA-FL over cell-free massive MIMO. The convergence of the
proposed implementation is studied analytically and experimentally, confirming
the benefits of cell-free massive MIMO for OTA-FL
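
The additive-channel idea behind over-the-air aggregation can be sketched in a few lines. This is a simplified illustration, not the cell-free massive MIMO implementation from the paper: it assumes unit channel gains, perfect synchronization, and real-valued Gaussian receiver noise, and the name `ota_aggregate` is hypothetical.

```python
import random

def ota_aggregate(updates, noise_std=0.0, rng=None):
    """Average client updates the way an idealized additive channel would.

    updates: list of equal-length lists, one per client. All clients
    transmit at once, so the receiver observes the element-wise sum plus
    noise, then divides by the number of clients.
    """
    rng = rng or random.Random(0)
    num_clients = len(updates)
    dim = len(updates[0])
    received = [
        sum(u[i] for u in updates) + rng.gauss(0.0, noise_std)
        for i in range(dim)
    ]
    return [r / num_clients for r in received]

clients = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]  # hypothetical local updates
noiseless = ota_aggregate(clients)
noisy = ota_aggregate(clients, noise_std=0.1, rng=random.Random(1))
print(noiseless, noisy)
```

The key point is that the channel use count does not grow with the number of clients: every client occupies the same resource, and the sum is formed by the channel itself.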
How Robust is Federated Learning to Communication Error? A Comparison Study Between Uplink and Downlink Channels
Because of its privacy-preserving capability, federated learning (FL) has
attracted significant attention from both academia and industry. However, when
FL is implemented over wireless networks, it is not clear how much
communication error it can tolerate. This paper investigates the
robustness of FL to uplink and downlink communication errors. Our
theoretical analysis reveals that the robustness depends on two critical
parameters, namely the number of clients and the numerical range of model
parameters. It is also shown that the uplink communication in FL can tolerate a
higher bit error rate (BER) than downlink communication, and this difference is
quantified by a proposed formula. The findings and theoretical analyses are
further validated by extensive experiments. Comment: Submitted to IEEE for possible publication
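
One intuition for why the number of clients and the numerical range of the parameters matter can be sketched with a toy fixed-point model. This is an illustrative assumption, not the formula proposed in the paper: parameters are uniformly quantized over a symmetric range, a single most-significant bit is flipped, and the uplink error is averaged across clients while the downlink error reaches a client undiluted. All names and values are hypothetical.

```python
def quantize(value, value_range, bits):
    """Map a value in [-value_range, value_range] to an unsigned integer code."""
    levels = (1 << bits) - 1
    clipped = max(-value_range, min(value_range, value))
    return round((clipped + value_range) / (2 * value_range) * levels)

def dequantize(code, value_range, bits):
    """Map an integer code back to a value in [-value_range, value_range]."""
    levels = (1 << bits) - 1
    return code / levels * 2 * value_range - value_range

def flip_bit(code, position):
    """Model a single bit error at the given bit position."""
    return code ^ (1 << position)

value_range, bits, num_clients = 1.0, 8, 10   # hypothetical settings
weight = 0.25                                 # same weight at every client

code = quantize(weight, value_range, bits)
clean = dequantize(code, value_range, bits)
corrupted = dequantize(flip_bit(code, bits - 1), value_range, bits)

# Downlink: the client receives the corrupted global weight directly.
downlink_error = abs(corrupted - clean)

# Uplink: one corrupted client value is averaged with the clean ones.
averaged = ((num_clients - 1) * clean + corrupted) / num_clients
uplink_error = abs(averaged - clean)

print(f"downlink error {downlink_error:.4f}, uplink error {uplink_error:.4f}")
```

In this toy model the error magnitude scales with the parameter range, and the uplink error shrinks in proportion to the number of clients.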
Imperfect CSI: A Key Factor of Uncertainty to Over-the-Air Federated Learning
Over-the-air computation (AirComp) has recently been identified as a
prominent technique to enhance communication efficiency of wireless federated
learning (FL). This letter investigates the impact of channel state information
(CSI) uncertainty at the transmitter on an AirComp-enabled FL (AirFL) system
with the truncated channel inversion strategy. To characterize the performance
of the AirFL system, the weight divergence with respect to the ideal
aggregation is analytically derived to evaluate learning performance loss. We
explicitly reveal how the weight divergence deteriorates as the level of
channel estimation accuracy vanishes, and how it decays with the increasing
number of participating devices. Building upon our analytical results, we
formulate the channel truncation threshold optimization problem to adapt to
different levels of channel estimation accuracy, which can be solved optimally. Numerical results verify the
analytical results and show that a lower truncation threshold is preferred with
more accurate CSI. Comment: Submitted to IEEE for possible publication
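
Truncated channel inversion under imperfect CSI can be illustrated with a scalar sketch. It assumes real-valued channel gains, a server that knows how many devices transmitted, and hand-picked estimation errors; none of these modeling choices or names (`airfl_aggregate`, the gain values) come from the paper.

```python
import random

def airfl_aggregate(updates, true_gains, est_gains, threshold, noise_std, rng):
    """Scalar AirComp aggregation with truncated channel inversion.

    A device transmits only if its estimated gain magnitude reaches the
    threshold, and it pre-scales its update by 1 / estimated gain. The
    channel applies the true gain, so any estimation error distorts the sum.
    """
    active = [k for k, g in enumerate(est_gains) if abs(g) >= threshold]
    if not active:
        return None, active
    received = sum(true_gains[k] * updates[k] / est_gains[k] for k in active)
    received += rng.gauss(0.0, noise_std)
    return received / len(active), active

updates = [1.0, 2.0, 3.0, 4.0]        # hypothetical local updates
true_gains = [1.2, 0.05, 0.8, 1.5]    # hypothetical true channel gains
est_gains = [1.0, 0.2, 0.9, 1.4]      # hypothetical imperfect estimates

# Perfect CSI: the weak device (index 1) is truncated, the rest invert exactly.
perfect, perfect_active = airfl_aggregate(
    updates, true_gains, true_gains, 0.1, 0.0, random.Random(0))

# Imperfect CSI: the weak device is misjudged as usable and inversion is inexact.
imperfect, imperfect_active = airfl_aggregate(
    updates, true_gains, est_gains, 0.1, 0.0, random.Random(0))

ideal = sum(updates[k] for k in imperfect_active) / len(imperfect_active)
divergence = abs(imperfect - ideal)
print(perfect, imperfect, divergence)
```

Raising the threshold excludes more weak or misestimated channels at the cost of fewer participating devices, which is the trade-off the threshold optimization in the abstract addresses.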
- …