One of the main challenges of decentralized machine learning paradigms such
as Federated Learning (FL) is the presence of local non-i.i.d. datasets.
Device-to-device (D2D) transfers between distributed devices have been shown to
be an effective tool for dealing with this problem and to be robust to stragglers. In
an unsupervised case, however, it is not obvious how data exchanges should take
place due to the absence of labels. In this paper, we propose an approach to
create an optimal graph for data transfer using Reinforcement Learning. The
goal is to form links that will provide the most benefit considering the
environment's constraints and improve convergence speed in an unsupervised FL
environment. Numerical analysis shows the advantages of the proposed method, in
terms of convergence speed and straggler resilience, over available FL schemes
across benchmark datasets.