Distributed learning on the edge often involves self-centered devices (SCDs),
which learn local tasks independently and are unwilling to contribute to the
performance of other SCDs. How do we achieve forward transfer at zero cost for
individual SCDs? We formalize this problem as a Distributed Continual Learning
scenario, where SCDs adapt to local tasks and a CL model consolidates the
knowledge from the resulting stream of models without looking at the SCDs'
private data. Unfortunately, current CL methods are not directly applicable to
this scenario. We propose Data-Agnostic Consolidation (DAC), a novel double
knowledge distillation method that consolidates the stream of SCD models without
using the original data. DAC performs distillation in the latent space via a
novel Projected Latent Distillation loss. Experimental results show that DAC
enables forward transfer between SCDs and reaches state-of-the-art accuracy on
Split CIFAR100, CORe50 and Split TinyImageNet, both in rehearsal-free and
distributed CL scenarios. Somewhat surprisingly, even a single
out-of-distribution image is sufficient as the only source of data during
consolidation.
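
To make the idea concrete, one possible form of a projected latent distillation loss is sketched below; the projection $W$, the latent maps $h_S$ and $h_T$, and the squared-error form are illustrative assumptions, not the paper's exact definition:
\[
\mathcal{L}_{\mathrm{PLD}}(x) = \bigl\lVert W\, h_S(x) - h_T(x) \bigr\rVert_2^2 ,
\]
where $h_S$ and $h_T$ denote the latent representations of the consolidated (student) model and of an SCD (teacher) model, $W$ is a projection aligning the two latent spaces, and $x$ is any data-agnostic input (e.g., the single out-of-distribution image mentioned above) rather than an SCD's private data.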