The ubiquity of camera-enabled devices has led to large amounts of unlabeled
image data being produced at the edge. The integration of self-supervised
learning (SSL) and federated learning (FL) into one coherent system can
potentially offer data privacy guarantees while also advancing the quality and
robustness of the learned visual representations without needing to move data
around. However, client bias and divergence during FL aggregation, caused by
data heterogeneity, limit the performance of learned visual representations on
downstream tasks. In this paper, we propose a new aggregation strategy termed
Layer-wise Divergence Aware Weight Aggregation (L-DAWA) to mitigate the
influence of client bias and divergence during FL aggregation. The proposed
method aggregates weights at the layer level according to the measure of
angular divergence between each client's model and the global model. Extensive
experiments with cross-silo and cross-device settings on CIFAR-10/100 and Tiny
ImageNet datasets demonstrate that our method is effective and obtains new
state-of-the-art (SOTA) performance for both contrastive and non-contrastive
SSL approaches.
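
To make the layer-level aggregation idea concrete, the following is a minimal sketch and not the paper's reference implementation. It assumes that angular divergence is measured as the cosine similarity between a client's flattened layer and the corresponding global layer, and that each client's layer is scaled by that similarity before a plain average over clients. The function names, the dictionary-of-arrays model layout, and the choice of normalization are all hypothetical.

```python
import numpy as np


def cosine_similarity(a, b, eps=1e-12):
    """Cosine of the angle between two arrays, treated as flat vectors."""
    a, b = a.ravel(), b.ravel()
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + eps))


def layerwise_divergence_aggregate(global_model, client_models):
    """Aggregate client weights one layer at a time (illustrative only).

    global_model:  dict mapping layer name -> np.ndarray
    client_models: list of dicts with the same keys and shapes

    Assumption: each client's layer is scaled by its cosine similarity
    to the global layer, and the scaled layers are averaged uniformly.
    """
    new_global = {}
    for name, global_layer in global_model.items():
        scaled = [
            cosine_similarity(client[name], global_layer) * client[name]
            for client in client_models
        ]
        new_global[name] = np.mean(scaled, axis=0)
    return new_global
```

Under these assumptions, a client layer that points in the same direction as the global layer contributes at full weight, while a layer orthogonal to it contributes nothing, so strongly divergent clients have less influence on that layer.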