To mitigate privacy leakage and the communication burden of Federated Learning (FL), decentralized FL (DFL) discards the central server, and each client communicates only with its neighbors in a decentralized communication network. However, existing DFL methods suffer from high inconsistency among local clients, which results in severe distribution shift and inferior performance compared with centralized FL (CFL), especially on heterogeneous data or sparse communication topologies. To alleviate this issue, we propose two DFL algorithms,
DFedSAM and DFedSAM-MGS, to improve the performance of DFL. Specifically, DFedSAM leverages gradient perturbation via Sharpness-Aware Minimization (SAM), which searches for models with uniformly low loss values, to generate flat local models. DFedSAM-MGS further boosts DFedSAM by adopting Multiple Gossip Steps (MGS) for better model consistency: repeating the gossip aggregation several times per round accelerates the averaging of local flat models and better balances communication complexity against generalization.
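To make the SAM component concrete, the following is a minimal PyTorch sketch of one perturbed local update; the function name sam_local_step and the hyperparameter values are illustrative assumptions, not the paper's implementation.

```python
import torch

def sam_local_step(model, loss_fn, batch, rho=0.05, lr=0.1):
    # One SAM-style local step: perturb the weights toward the gradient
    # ascent direction (inner maximization), take the gradient at the
    # perturbed point, then update the original weights with it.
    x, y = batch
    params = [p for p in model.parameters() if p.requires_grad]

    # First pass: gradient at the current weights w.
    grads = torch.autograd.grad(loss_fn(model(x), y), params)
    grad_norm = torch.sqrt(sum(g.pow(2).sum() for g in grads))

    # SAM perturbation eps = rho * g / ||g||.
    eps = [rho * g / (grad_norm + 1e-12) for g in grads]
    with torch.no_grad():
        for p, e in zip(params, eps):
            p.add_(e)

    # Second pass: gradient at the perturbed weights w + eps.
    grads_sam = torch.autograd.grad(loss_fn(model(x), y), params)

    # Restore w and descend using the SAM gradient.
    with torch.no_grad():
        for p, e, g in zip(params, eps, grads_sam):
            p.sub_(e)
            p.sub_(lr * g)
```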
Theoretically, we present improved convergence rates $\mathcal{O}\big(\frac{1}{\sqrt{KT}}+\frac{1}{T}+\frac{1}{K^{1/2}T^{3/2}(1-\lambda)^{2}}\big)$ and $\mathcal{O}\big(\frac{1}{\sqrt{KT}}+\frac{1}{T}+\frac{\lambda^{Q}+1}{K^{1/2}T^{3/2}(1-\lambda^{Q})^{2}}\big)$ in the non-convex setting for DFedSAM and DFedSAM-MGS, respectively, where $1-\lambda$ is the spectral gap of the gossip matrix and $Q$ is the number of gossip steps in MGS.
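For intuition about MGS and the role of $\lambda$, here is a minimal NumPy sketch of the gossip averaging loop; the function name, the stacked-parameter layout, and the assumption that W is doubly stochastic are ours, for illustration only.

```python
import numpy as np

def multiple_gossip_steps(local_models, W, Q):
    # local_models: (n_clients, dim) array, one flattened model per row.
    # W: (n_clients, n_clients) doubly stochastic mixing matrix whose
    # sparsity pattern matches the communication topology.
    x = local_models.copy()
    for _ in range(Q):
        # Each client averages its neighbors: x_i <- sum_j W[i, j] * x_j.
        x = W @ x
    return x
```

In the standard gossip analysis, each step contracts the deviation from the average model by a factor of $\lambda$, so $Q$ steps contract it by $\lambda^{Q}$; accordingly, as $Q$ grows, the term $\frac{\lambda^{Q}+1}{K^{1/2}T^{3/2}(1-\lambda^{Q})^{2}}$ approaches $\frac{1}{K^{1/2}T^{3/2}}$, i.e., extra gossip communication buys consistency closer to the well-connected (CFL-like) regime.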
Empirically, our methods achieve competitive performance compared with CFL methods and outperform existing DFL methods.