Federated learning (FL) has shown remarkable success in cooperatively
training deep models, but it typically struggles in the presence of noisy labels. Advanced
works propose to tackle label noise with a re-weighting strategy that rests on a strong
assumption, i.e., mild label noise. However, this assumption may be violated in many
real-world FL scenarios because of highly contaminated clients, whose noise ratios can be
extreme, e.g., >90%. To tackle extremely noisy clients, we study
the robustness of the re-weighting strategy and reach a pessimistic conclusion:
simply minimizing the weight of clients trained on noisy data outperforms
re-weighting strategies. To leverage models trained on noisy clients, we
propose a novel approach called negative distillation (FedNed). FedNed first
identifies noisy clients and then exploits them, rather than discarding them,
through knowledge distillation. In particular, each client identified as noisy
trains two models: one on its noisy labels and one on pseudo-labels produced
by the global model.
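Since the abstract does not spell out the client-side procedure, the following is only a minimal PyTorch sketch of this dual-model update; the helper name `train_noisy_client`, the SGD optimizer, and the hard-argmax pseudo-labeling are illustrative assumptions rather than the paper's exact implementation.

```python
import copy
import torch
import torch.nn.functional as F

def train_noisy_client(global_model, local_loader, lr=0.01, epochs=1, device="cpu"):
    """Hypothetical dual-model update on a client flagged as noisy."""
    # Model trained on the client's (possibly noisy) labels: the future `bad teacher'.
    bad_teacher = copy.deepcopy(global_model).to(device)
    # Model trained on pseudo-labels predicted by the current global model.
    pseudo_model = copy.deepcopy(global_model).to(device)
    opt_bad = torch.optim.SGD(bad_teacher.parameters(), lr=lr)
    opt_pseudo = torch.optim.SGD(pseudo_model.parameters(), lr=lr)
    global_model = global_model.to(device).eval()

    for _ in range(epochs):
        for x, noisy_y in local_loader:
            x, noisy_y = x.to(device), noisy_y.to(device)
            with torch.no_grad():  # pseudo-labels come from the global model
                pseudo_y = global_model(x).argmax(dim=1)

            opt_bad.zero_grad()
            F.cross_entropy(bad_teacher(x), noisy_y).backward()
            opt_bad.step()

            opt_pseudo.zero_grad()
            F.cross_entropy(pseudo_model(x), pseudo_y).backward()
            opt_pseudo.step()

    # The bad teacher is returned for negative distillation; the pseudo-label
    # model is a candidate for ordinary aggregation.
    return bad_teacher, pseudo_model
```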
The model trained on noisy labels serves as a `bad teacher' in knowledge
distillation: rather than being imitated, its predictions mark what the global
model should avoid, decreasing the risk of propagating incorrect information.
Meanwhile, the model trained on pseudo-labels is involved in model aggregation
once its client is no longer identified as noisy.
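To make `negative distillation' concrete, the sketch below flips the sign of the usual distillation loss so that agreement with the bad teacher is penalized instead of rewarded; the temperature `tau` and the sign-flipped KL formulation are assumptions for illustration, since the abstract does not state the exact objective.

```python
import torch.nn.functional as F

def negative_distillation_loss(student_logits, bad_teacher_logits, tau=1.0):
    """Hypothetical objective: penalize the student (global model) for
    agreeing with a `bad teacher' trained on noisy labels."""
    p_teacher = F.softmax(bad_teacher_logits / tau, dim=1)
    log_p_student = F.log_softmax(student_logits / tau, dim=1)
    kl = F.kl_div(log_p_student, p_teacher, reduction="batchmean")
    # Standard distillation minimizes this KL term; negating it instead
    # pushes the student's predictions away from the bad teacher's.
    return -kl
```

In practice such a term would presumably be weighted against a supervised loss (e.g., `total = ce_loss + lam * negative_distillation_loss(...)` with an assumed coefficient `lam`), since maximizing disagreement alone is unbounded.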
Consequently, through pseudo-labeling, FedNed gradually increases the trustworthiness of models
trained on noisy clients, while leveraging all clients for model aggregation
through negative distillation. To verify the efficacy of FedNed, we conduct
extensive experiments under various settings, demonstrating that FedNed can
consistently outperform baselines and achieve state-of-the-art performance. Our
code is available at https://github.com/linChen99/FedNed.