Privacy-Preserving Federated Deep Clustering based on GAN
Federated clustering (FC) is an essential extension of centralized clustering
designed for the federated setting, wherein the challenge lies in constructing
a global similarity measure without the need to share private data.
Conventional approaches to FC typically adopt extensions of centralized
methods, like K-means and fuzzy c-means. However, these methods are susceptible
to non-independent-and-identically-distributed (non-IID) data among clients,
leading to suboptimal performance, particularly with high-dimensional data. In
this paper, we present a novel approach to address these limitations by
proposing a Privacy-Preserving Federated Deep Clustering based on Generative
Adversarial Networks (GANs). Each client trains a GAN locally and uploads
its synthetic data to the server. The server
applies a deep clustering network on the synthetic data to establish
cluster centroids, which are then downloaded to the clients for cluster
assignment. Theoretical analysis demonstrates that the GAN-generated samples,
shared among clients, inherently uphold certain privacy guarantees,
safeguarding the confidentiality of individual data. Furthermore, extensive
experimental evaluations showcase the effectiveness and utility of our proposed
method in achieving accurate and privacy-preserving federated clustering.
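
The final step of the pipeline described above, in which clients download the server's centroids and assign their own samples, can be illustrated with a minimal sketch. It assumes plain Euclidean nearest-centroid assignment in the raw feature space; the paper's deep clustering network may well assign in a learned embedding instead. The function name `assign_to_centroids` and the sample values are hypothetical.

```python
import math


def assign_to_centroids(points, centroids):
    """Return, for each local point, the index of its nearest centroid.

    Assumption: Euclidean distance in the input space. This is an
    illustration only, not the paper's actual assignment rule.
    """
    labels = []
    for p in points:
        distances = [math.dist(p, c) for c in centroids]
        labels.append(distances.index(min(distances)))
    return labels


# Hypothetical centroids downloaded from the server.
centroids = [(0.0, 0.0), (5.0, 5.0)]

# Private local data; it never leaves the client.
local_data = [(0.2, -0.1), (4.8, 5.3), (0.9, 1.1)]

labels = assign_to_centroids(local_data, centroids)
```

Only the centroids travel from server to client in this step, so the private samples are labeled without being shared.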
Federated clustering with GAN-based data synthesis
Federated clustering (FC) is an extension of centralized clustering in
federated settings. The key here is how to construct a global similarity
measure without sharing private data, since the local similarity may be
insufficient to group local data correctly and the similarity of samples across
clients cannot be directly measured due to privacy constraints. The
most straightforward approach to FC is to employ methods extended from
centralized ones, such as K-means (KM) and fuzzy c-means (FCM). However, they
are vulnerable to non-independent-and-identically-distributed (non-IID) data
among clients. To handle this, we propose a new federated clustering framework,
named synthetic data aided federated clustering (SDA-FC). It trains a generative
adversarial network locally on each client and uploads the generated synthetic
data to the server, where KM or FCM is performed on the synthetic data. The
synthetic data can make the model immune to the non-IID problem and enable us
to capture the global similarity characteristics more effectively without
sharing private data. Comprehensive experiments reveal the advantages of
SDA-FC, including superior performance in addressing the non-IID problem and
device failures.
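
The SDA-FC flow described above (a local generative model per client, synthetic data uploaded, K-means on the server) can be sketched roughly as follows. This is not the authors' implementation: to stay short and runnable, a per-dimension Gaussian fit stands in for the locally trained GAN, and all function names, sample sizes, and data values are hypothetical.

```python
import math
import random


def fit_stand_in_generator(local_data):
    """Stand-in for a locally trained GAN (assumption, not the paper's model):
    records the per-dimension mean and standard deviation of the local data."""
    dims = list(zip(*local_data))
    means = [sum(d) / len(d) for d in dims]
    stds = [
        math.sqrt(sum((x - m) ** 2 for x in d) / len(d))
        for d, m in zip(dims, means)
    ]
    return means, stds


def sample_synthetic(generator, n, rng):
    """Draw n synthetic points; only these are uploaded to the server."""
    means, stds = generator
    return [tuple(rng.gauss(m, s) for m, s in zip(means, stds)) for _ in range(n)]


def kmeans(points, k, rng, iters=20):
    """Plain Lloyd's K-means, run by the server on the pooled synthetic data."""
    centroids = rng.sample(points, k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            j = min(range(k), key=lambda i: math.dist(p, centroids[i]))
            clusters[j].append(p)
        # Keep the old centroid if a cluster happens to be empty.
        centroids = [
            tuple(sum(c) / len(c) for c in zip(*cl)) if cl else centroids[i]
            for i, cl in enumerate(clusters)
        ]
    return centroids


rng = random.Random(0)

# Extreme non-IID split: each client holds samples from a single cluster.
client_a = [(rng.gauss(0, 0.3), rng.gauss(0, 0.3)) for _ in range(50)]
client_b = [(rng.gauss(5, 0.3), rng.gauss(5, 0.3)) for _ in range(50)]

# Each client fits its generator locally and uploads synthetic samples only.
synthetic = []
for data in (client_a, client_b):
    generator = fit_stand_in_generator(data)
    synthetic.extend(sample_synthetic(generator, 50, rng))

# The server clusters the pooled synthetic data.
centroids = kmeans(synthetic, k=2, rng=rng)
```

Because the server sees synthetic samples from every client at once, it can recover both clusters even though neither client observes more than one of them locally, which is the intuition behind the claimed robustness to non-IID data.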
- …