Share Your Representation Only: Guaranteed Improvement of the Privacy-Utility Tradeoff in Federated Learning
Repeated parameter sharing in federated learning causes significant
information leakage about private data, thus defeating its main purpose: data
privacy. Mitigating the risk of this information leakage with state-of-the-art
differentially private algorithms also does not come for free: randomized
mechanisms can prevent the models from learning even the useful representation
functions, especially when local models disagree more strongly on the
classification functions (due to data heterogeneity). In
this paper, we consider a representation federated learning objective that
encourages various parties to collaboratively refine the consensus part of the
model, with differential privacy guarantees, while separately allowing
sufficient freedom for local personalization (without releasing it). We prove
that in the linear representation setting, while the objective is non-convex,
our proposed algorithm DP-FedRep converges at a linear rate to a ball centered
around the globally optimal solution, whose radius is
proportional to the reciprocal of the privacy budget. With this novel utility
analysis, we improve the SOTA utility-privacy trade-off for this problem by a
factor of $\sqrt{d}$, where $d$ is the input dimension. We empirically evaluate
our method with the image classification task on CIFAR10, CIFAR100, and EMNIST,
and observe a significant performance improvement over the prior work under the
same small privacy budget. The code can be found at:
https://github.com/shenzebang/CENTAUR-Privacy-Federated-Representation-Learning
Comment: ICLR 2023 revision
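To make the split concrete, here is a minimal sketch of the shared-representation idea in the linear setting the abstract analyzes: each client keeps a personal head that never leaves the device, while only clipped, noised gradients of the shared representation are released and averaged. All names, sizes, and DP parameters below are illustrative assumptions, not the paper's implementation; see the linked repository for the actual DP-FedRep code.

```python
import numpy as np

rng = np.random.default_rng(0)
DIM, RANK, N_CLIENTS, N_SAMPLES = 20, 3, 5, 50  # hypothetical problem sizes
CLIP, SIGMA, LR = 1.0, 0.5, 0.1                 # hypothetical clip/noise/step

# Synthetic linear-regression clients that share a ground-truth representation.
U_true = np.linalg.qr(rng.normal(size=(DIM, RANK)))[0]
clients = []
for _ in range(N_CLIENTS):
    head = rng.normal(size=RANK)                # true personal head
    X = rng.normal(size=(N_SAMPLES, DIM))
    y = X @ U_true @ head + 0.01 * rng.normal(size=N_SAMPLES)
    clients.append({"X": X, "y": y, "v": rng.normal(size=RANK)})

U = np.linalg.qr(rng.normal(size=(DIM, RANK)))[0]   # shared representation
for _ in range(100):
    noisy_grads = []
    for c in clients:
        X, y, v = c["X"], c["y"], c["v"]
        # Refine the personal head locally; it is never released, so no noise.
        for _ in range(5):
            v -= LR * (X @ U).T @ ((X @ U) @ v - y) / N_SAMPLES
        # Gradient w.r.t. the shared representation U: clip its norm,
        # then add Gaussian noise before it leaves the client.
        g = X.T @ ((X @ U) @ v - y)[:, None] @ v[None, :] / N_SAMPLES
        g *= min(1.0, CLIP / (np.linalg.norm(g) + 1e-12))
        noisy_grads.append(g + SIGMA * CLIP * rng.normal(size=g.shape))
    U -= LR * np.mean(noisy_grads, axis=0)      # server averages noisy updates
```

Only the representation update carries a privacy cost here; the personalization step is free because the head stays on the client, which is the intuition behind the improved trade-off.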
Differentially private multi-task learning
Privacy restrictions on sensitive data repositories imply that data analysis must be performed in isolation at each data source. A prime example is the isolated nature of building prognosis models from hospital data, and the associated challenge of dealing with small numbers of samples in risk classes (e.g. suicide) while doing so. Pooling knowledge from other hospitals through multi-task learning can alleviate this problem. However, if knowledge is shared unrestricted, privacy is breached. Addressing this, we propose a novel multi-task learning method that preserves the privacy of data under the strong guarantees of differential privacy. Further, we develop a novel attribute-wise noise addition scheme that significantly lifts the utility of the proposed method. We demonstrate the effectiveness of our method on one synthetic and two real datasets.
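A minimal sketch of what attribute-wise noise addition could look like: noise is calibrated to each attribute's own sensitivity rather than to one global bound, so tightly bounded features are perturbed less. The Laplace mechanism, the even split of the privacy budget across attributes, and all numeric values are assumptions for illustration, not the paper's exact scheme.

```python
import numpy as np

rng = np.random.default_rng(1)
EPSILON = 1.0                        # hypothetical total privacy budget

def privatize_attribute_wise(weights, sensitivities, epsilon):
    """Add Laplace noise to each attribute's weight, scaled to that
    attribute's own sensitivity, splitting epsilon evenly across attributes."""
    eps_per_attr = epsilon / len(weights)
    scales = sensitivities / eps_per_attr
    return weights + rng.laplace(scale=scales)

# One hospital's locally fitted model and per-attribute sensitivity bounds
# (assumed known, e.g. from clipped feature ranges).
local_w = np.array([0.8, -0.3, 1.2, 0.05])
sens = np.array([0.1, 0.1, 0.5, 0.02])   # heterogeneous per-attribute bounds

noisy_w = privatize_attribute_wise(local_w, sens, EPSILON)
print(noisy_w)   # noised weights, shareable for multi-task pooling
```

Compared with adding one noise level sized to the worst-case attribute, this per-attribute calibration is what lets such a scheme lift utility while keeping the overall differential privacy guarantee via composition.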