RDP-GAN: A Rényi-Differential Privacy based Generative Adversarial Network
Generative adversarial networks (GANs) have attracted increasing attention
recently owing to their impressive ability to generate realistic samples with
high privacy protection. Without directly interacting with the training
examples, the generative model can be used to estimate the underlying
distribution of an original dataset, while the discriminative model can
examine the quality of the generated samples by comparing them with the
training examples. However, when GANs are applied to sensitive or private
training data, such as medical or financial records, they may still divulge
individuals' sensitive and private information. To mitigate this information
leakage and construct a private GAN, in this work we propose a
Rényi-differentially private GAN (RDP-GAN), which achieves differential
privacy (DP) in a GAN by carefully adding random noise to the value of the
loss function during training. Moreover, we derive the analytical results of
the total privacy loss under the subsampling method and accumulated
iterations, which demonstrate its effectiveness for privacy budget allocation.
In addition, to mitigate the negative impact of the injected noise, we enhance
the proposed algorithm with an adaptive noise tuning step, which adjusts the
amount of added noise according to the testing accuracy. Through extensive
experiments, we verify that the proposed algorithm can achieve a better
privacy level while producing high-quality samples compared with a benchmark
DP-GAN scheme based on noise perturbation of training gradients.
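The loss-perturbation idea described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the clipping bound `clip_c`, noise scale `sigma`, and the accuracy-driven tuning rule in `adapt_sigma` are all assumed for the example.

```python
import numpy as np

def dp_perturbed_loss(loss_value, clip_c, sigma, rng):
    """Clip a scalar loss to bound its sensitivity, then add Gaussian noise.

    Sketch of loss-value perturbation for DP; `clip_c` (clipping bound) and
    `sigma` (relative noise scale) are assumed hyperparameters.
    """
    clipped = max(min(loss_value, clip_c), -clip_c)
    return clipped + rng.normal(0.0, sigma * clip_c)

def adapt_sigma(sigma, test_accuracy, target=0.8, factor=0.9):
    """Hypothetical adaptive tuning step: shrink the noise scale when the
    testing accuracy falls below a target, mirroring the adaptive idea."""
    return sigma * factor if test_accuracy < target else sigma
```

For example, `dp_perturbed_loss(10.0, 1.0, 0.1, np.random.default_rng(0))` first clips the loss to 1.0 and then adds zero-mean noise with standard deviation 0.1.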
User-Level Privacy-Preserving Federated Learning: Analysis and Performance Optimization
Federated learning (FL), as a collaborative machine learning framework, can
preserve the private data of mobile terminals (MTs) while training useful
models from them. Nevertheless, from an information-theoretic viewpoint, it is
still possible for a curious server to infer private information from the
shared models uploaded by MTs. To address this problem, we
first make use of the concept of local differential privacy (LDP), and propose
a user-level differential privacy (UDP) algorithm by adding artificial noise to
the shared models before uploading them to servers. According to our analysis,
the UDP framework can realize $(\epsilon_{i}, \delta_{i})$-LDP for the $i$-th
MT, with adjustable privacy protection levels obtained by varying the variances of the
artificial noise processes. We then derive a theoretical convergence
upper-bound for the UDP algorithm. It reveals that there exists an optimal
number of communication rounds to achieve the best learning performance. More
importantly, we propose a communication rounds discounting (CRD) method.
Compared with the heuristic search method, the proposed CRD method can achieve
a much better trade-off between the computational complexity of searching and
the convergence performance. Extensive experiments indicate that our UDP
algorithm using the proposed CRD method can effectively improve both the
training efficiency and model quality for the given privacy protection levels.
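The noise-addition step behind UDP can be sketched as follows: clip the local model update and perturb it with Gaussian noise before upload, so that a larger noise variance yields stronger privacy. The clipping bound `clip_c` and noise standard deviation `sigma` here are illustrative assumptions, not the paper's exact calibration.

```python
import numpy as np

def udp_upload(weights, clip_c, sigma, rng):
    """Prepare a local model update for upload under user-level DP (sketch).

    Clips the update's L2 norm to `clip_c`, then adds i.i.d. Gaussian noise
    with standard deviation `sigma` to every coordinate.
    """
    norm = np.linalg.norm(weights)
    clipped = weights * min(1.0, clip_c / max(norm, 1e-12))
    return clipped + rng.normal(0.0, sigma, size=weights.shape)
```

Updates whose norm is already below `clip_c` pass through unchanged; only the added noise then distinguishes the upload from the true update.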
Learning while Respecting Privacy and Robustness to Distributional Uncertainties and Adversarial Data
Data used to train machine learning models can be adversarial, that is,
maliciously constructed by adversaries to fool the model. Challenges also
arise from privacy, confidentiality, or legal constraints when data are
geographically gathered and stored across multiple learners, some of which may
hold only an "anonymized" or unreliable dataset. In this context, the distributionally
robust optimization framework is considered for training a parametric model,
both in centralized and federated learning settings. The objective is to endow
the trained model with robustness against adversarially manipulated input data,
or, distributional uncertainties, such as mismatches between training and
testing data distributions, or among datasets stored at different workers. To
this aim, the data distribution is assumed unknown, and lies within a
Wasserstein ball centered around the empirical data distribution. This robust
learning task entails an infinite-dimensional optimization problem, which is
challenging. Leveraging a strong duality result, a surrogate is obtained, for
which three stochastic primal-dual algorithms are developed: i) stochastic
proximal gradient descent with an $\epsilon$-accurate oracle, which invokes an
oracle to solve the convex sub-problems; ii) stochastic proximal gradient
descent-ascent, which approximates the solution of the convex sub-problems via
a single gradient ascent step; and, iii) a distributionally robust federated
learning algorithm, which solves the sub-problems locally at different workers
where data are stored. Compared to the empirical risk minimization and
federated learning methods, the proposed algorithms offer robustness with
little computation overhead. Numerical tests using image datasets showcase the
merits of the proposed algorithms under several existing adversarial attacks
and distributional uncertainties.

Comment: 14 pages, 5 figures
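The descent-ascent scheme in (ii) can be illustrated on a squared loss with a quadratic transport penalty standing in for the Wasserstein constraint. This is a sketch of the single-ascent-step idea under assumed step sizes `eta_z`, `eta_theta` and penalty weight `gamma`, not the paper's algorithm.

```python
import numpy as np

def sgda_step(theta, x, y, gamma, eta_z, eta_theta):
    # One sketch step: a single gradient ascent step perturbs the sample x
    # into z for the penalized objective
    #   (theta @ z - y)^2 - gamma * ||z - x||^2,
    # then a gradient descent step updates theta at the perturbed sample.
    z = x.copy()
    # Ascent gradient in z; the penalty term vanishes at the start z = x.
    grad_z = 2.0 * (theta @ z - y) * theta - 2.0 * gamma * (z - x)
    z = z + eta_z * grad_z
    # Descent gradient in theta, evaluated at the adversarial sample z.
    grad_theta = 2.0 * (theta @ z - y) * z
    return theta - eta_theta * grad_theta, z
```

When the model already fits the sample (zero residual), the ascent step leaves the data unperturbed and the parameters do not move; a nonzero residual pushes `z` in the direction that increases the loss before the descent step corrects `theta`.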