Differentially Private Federated Clustering over Non-IID Data
In this paper, we investigate the federated clustering (FedC) problem, which aims
to accurately partition unlabeled data samples distributed over a massive number
of clients into a finite number of clusters under the orchestration of a parameter
server, while also accounting for data privacy. Though it is an NP-hard optimization problem
involving real variables denoting cluster centroids and binary variables
denoting the cluster membership of each data sample, we judiciously reformulate
the FedC problem into a non-convex optimization problem with only one convex
constraint, which yields a soft clustering solution. We then propose a novel FedC
algorithm based on the differential privacy (DP) technique, referred to as DP-FedC,
which also accommodates partial client participation and multiple local model
update steps. Furthermore, various attributes of the
proposed DP-FedC are obtained through theoretical analyses of privacy
protection and convergence rate, especially for the case of non-independent and
identically distributed (non-i.i.d.) data, which can serve as guidelines for the
design of the proposed DP-FedC. Experimental results on two real datasets
demonstrate the efficacy of the proposed DP-FedC, its markedly superior
performance over some state-of-the-art FedC algorithms, and its consistency
with all the presented analytical results.
Comment: 31 pages, 4 figures, 1 tabl
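
The ingredients named in the abstract (soft cluster memberships, partial client participation, multiple local update steps, and DP noise on what clients upload) can be illustrated with a minimal sketch. This is not the DP-FedC algorithm itself: it swaps in a generic soft k-means objective with softmax memberships, and every function name and parameter below (soft_assign, local_update, clip, dp_fed_soft_kmeans, clip_norm, noise_std) is a hypothetical choice made for illustration. The noise level is not calibrated to any particular (epsilon, delta) privacy budget.

```python
import numpy as np

def soft_assign(X, C, temp=1.0):
    """Soft memberships: softmax over negative squared distances to centroids."""
    d2 = ((X[:, None, :] - C[None, :, :]) ** 2).sum(axis=2)  # shape (n, K)
    logits = -d2 / temp
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    W = np.exp(logits)
    return W / W.sum(axis=1, keepdims=True)

def local_update(X, C, steps, lr):
    """Several local gradient steps on a soft k-means loss (assumed objective)."""
    C = C.copy()
    for _ in range(steps):
        W = soft_assign(X, C)
        # Gradient of 0.5 * sum_ik W_ik * ||x_i - c_k||^2 w.r.t. c_k, W held fixed.
        grad = (W.sum(axis=0)[:, None] * C - W.T @ X) / len(X)
        C -= lr * grad
    return C

def clip(delta, max_norm):
    """Scale the update so its Frobenius norm is at most max_norm."""
    norm = np.linalg.norm(delta)
    return delta * min(1.0, max_norm / (norm + 1e-12))

def dp_fed_soft_kmeans(client_data, K, rounds=20, clients_per_round=3,
                       local_steps=5, lr=0.5, clip_norm=1.0,
                       noise_std=0.05, seed=0):
    """Federated soft clustering with clipped, Gaussian-perturbed client updates."""
    rng = np.random.default_rng(seed)
    dim = client_data[0].shape[1]
    C = rng.normal(size=(K, dim))  # server-side centroid estimate
    for _ in range(rounds):
        # Partial participation: only a random subset of clients reports.
        chosen = rng.choice(len(client_data), size=clients_per_round,
                            replace=False)
        updates = []
        for i in chosen:
            delta = local_update(client_data[i], C, local_steps, lr) - C
            delta = clip(delta, clip_norm)
            # Gaussian perturbation; scale is illustrative, not calibrated.
            delta = delta + rng.normal(scale=noise_std * clip_norm,
                                       size=delta.shape)
            updates.append(delta)
        C = C + np.mean(updates, axis=0)  # server averages the noisy updates
    return C
```

In this sketch, clipping bounds each client's contribution before noise is added, which is the usual precondition for a Gaussian-mechanism privacy argument; the data are deliberately non-i.i.d. when each client holds samples from only some of the clusters.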