Federated Learning (FL) is an emerging machine learning technique that
enables distributed model training across data silos or edge devices without
data sharing. Yet, FL inevitably introduces inefficiencies compared to
centralized model training, which will further increase the already high energy
usage and associated carbon emissions of machine learning in the future.
Although the scheduling of workloads based on the availability of low-carbon
energy has received considerable attention in recent years, it has not yet been
investigated in the context of FL. However, FL is a highly promising use case
for carbon-aware computing, as training jobs consist of energy-intensive
batch processes scheduled in geo-distributed environments.
We propose FedZero, an FL system that operates exclusively on renewable excess
energy and spare capacity of compute infrastructure to effectively reduce the
training's operational carbon emissions to zero. Based on energy and load
forecasts, FedZero leverages the spatio-temporal availability of excess energy
by cherry-picking clients for fast convergence and fair participation. Our
evaluation, based on real solar and load traces, shows that FedZero converges
considerably faster under these constraints than state-of-the-art
approaches, is highly scalable, and is robust against forecasting errors.
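
To make the idea of forecast-driven client selection concrete, the following is a minimal sketch and not FedZero's actual selection algorithm, which the abstract does not specify. It assumes each client reports a forecast of excess energy and spare capacity for the upcoming round, keeps only clients that can complete a minimum amount of work within both budgets, and prefers clients that have participated least often as a simple stand-in for fairness. All names, fields, and units (`Client`, `feasible_batches`, `select_clients`, watt-hours, batches per round) are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Client:
    name: str
    excess_energy_wh: float     # forecast excess renewable energy for the round (assumed unit)
    spare_capacity: float       # forecast spare compute, in batches per round (assumed unit)
    energy_per_batch_wh: float  # energy needed to process one batch (assumed)
    participations: int = 0     # number of past rounds this client took part in


def feasible_batches(client: Client) -> int:
    """Batches the client can process within both its energy and capacity forecasts."""
    if client.energy_per_batch_wh <= 0:
        return 0
    by_energy = client.excess_energy_wh / client.energy_per_batch_wh
    return int(min(by_energy, client.spare_capacity))


def select_clients(clients: list[Client], n: int, min_batches: int) -> list[Client]:
    """Greedy selection: drop clients that cannot do min_batches, then prefer
    clients with fewer past participations, breaking ties by more feasible work."""
    eligible = [c for c in clients if feasible_batches(c) >= min_batches]
    eligible.sort(key=lambda c: (c.participations, -feasible_batches(c)))
    chosen = eligible[:n]
    for c in chosen:
        c.participations += 1
    return chosen


clients = [
    Client("a", excess_energy_wh=100, spare_capacity=50, energy_per_batch_wh=2),
    Client("b", excess_energy_wh=10, spare_capacity=100, energy_per_batch_wh=2),
    Client("c", excess_energy_wh=200, spare_capacity=30, energy_per_batch_wh=1,
           participations=3),
    Client("d", excess_energy_wh=80, spare_capacity=40, energy_per_batch_wh=2),
]
selected = select_clients(clients, n=2, min_batches=10)
print([c.name for c in selected])
```

In this toy setup, client "b" is excluded because its energy forecast is too low, and client "c" is deprioritized because it has already participated several times. A real system would additionally have to account for forecast uncertainty and for how energy availability varies over the duration of a round.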