The Federated Learning (FL) paradigm is known to face challenges under
heterogeneous client data. Local training on non-iid data results in deflected
local optima, which causes the client models to drift further away from each
other and degrades the aggregated global model's performance. A
natural solution is to gather all client data onto the server, such that the
server has a global view of the entire data distribution. Unfortunately, this
reduces FL to regular centralized training, which compromises clients' privacy and conflicts
with the purpose of FL. In this paper, we put forth an idea to collect and
leverage global knowledge on the server without violating data privacy. We
unearth such knowledge from the dynamics of the global model's trajectory.
Specifically, we first reserve a short trajectory of global model snapshots on
the server. Then, we synthesize a small pseudo dataset such that the model
trained on it mimics the dynamics of the reserved global model trajectory.
Afterward, the synthesized data is used to help aggregate the deflected client models
into the global model. We name our method Dynafed, which enjoys the following
advantages: 1) we do not rely on any external on-server dataset, so no
additional cost for data collection is incurred; 2) the pseudo data can be
synthesized in early communication rounds, which enables Dynafed to take effect
early, boosting convergence and stabilizing training; 3) the pseudo data only
needs to be synthesized once and can be directly utilized on the server to help
aggregation in subsequent rounds. Extensive experiments across benchmarks
demonstrate the effectiveness of Dynafed. We also provide insights into the
underlying mechanism of our method.
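
To make the two server-side steps above concrete, the following Python (PyTorch-style) sketch illustrates (i) synthesizing pseudo data by matching the dynamics of the reserved global-model trajectory, and (ii) fine-tuning the aggregated global model on that pseudo data. This is a minimal illustration under our own assumptions, not the actual Dynafed implementation; the function names, the toy MLP, and all hyperparameters (e.g., inner_steps, inner_lr) are hypothetical.

# Minimal sketch of the two server-side steps (illustrative assumptions only).
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.func import functional_call


def synthesize_pseudo_data(trajectory, model_template, num_pseudo=100,
                           num_classes=10, feat_dim=32, inner_steps=10,
                           inner_lr=0.1, outer_iters=200, outer_lr=0.01):
    """Optimize a small pseudo dataset so that a model trained on it follows
    the dynamics of the reserved global-model trajectory (trajectory matching)."""
    x_syn = torch.randn(num_pseudo, feat_dim, requires_grad=True)
    y_syn = torch.arange(num_pseudo) % num_classes  # fixed, balanced labels
    outer_opt = torch.optim.Adam([x_syn], lr=outer_lr)

    for _ in range(outer_iters):
        # Sample a segment (w_t -> w_{t+1}) of the reserved trajectory.
        t = torch.randint(0, len(trajectory) - 1, (1,)).item()
        snap_t, snap_next = trajectory[t], trajectory[t + 1]

        # Unroll a few SGD steps on the pseudo data starting from w_t,
        # keeping the graph so gradients flow back to x_syn.
        params = {k: v.clone().requires_grad_(True) for k, v in snap_t.items()}
        for _ in range(inner_steps):
            logits = functional_call(model_template, params, (x_syn,))
            loss = F.cross_entropy(logits, y_syn)
            grads = torch.autograd.grad(loss, list(params.values()),
                                        create_graph=True)
            params = {k: p - inner_lr * g
                      for (k, p), g in zip(params.items(), grads)}

        # Match the endpoint of the unrolled segment to the next reserved
        # snapshot, normalized by the length of the real segment.
        num = sum(((params[k] - snap_next[k]) ** 2).sum() for k in params)
        den = sum(((snap_t[k] - snap_next[k]) ** 2).sum() for k in params) + 1e-8
        match_loss = num / den

        outer_opt.zero_grad()
        match_loss.backward()
        outer_opt.step()

    return x_syn.detach(), y_syn


def refine_global_model(global_model, x_syn, y_syn, steps=20, lr=0.01):
    """After averaging client updates, fine-tune the aggregated global model
    on the synthesized pseudo data to correct the drift of client models."""
    opt = torch.optim.SGD(global_model.parameters(), lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        F.cross_entropy(global_model(x_syn), y_syn).backward()
        opt.step()
    return global_model


if __name__ == "__main__":
    # Toy demo: a 2-layer MLP and a fake "trajectory" of perturbed snapshots
    # standing in for the reserved global-model checkpoints.
    model = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 10))
    trajectory = []
    for _ in range(4):
        trajectory.append({k: v.detach().clone()
                           for k, v in model.state_dict().items()})
        for p in model.parameters():  # stand-in for one communication round
            p.data.add_(0.01 * torch.randn_like(p))
    x_syn, y_syn = synthesize_pseudo_data(trajectory, model, outer_iters=10)
    refine_global_model(model, x_syn, y_syn)

Note that the pseudo data is synthesized once from the reserved trajectory; in later rounds the server only calls the refinement step on the freshly aggregated model.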