Hypothesis Testing Interpretations and Renyi Differential Privacy
Differential privacy is a de facto standard in data privacy, with
applications in the public and private sectors. A way to explain differential
privacy, which is particularly appealing to statisticians and social scientists,
is by means of its statistical hypothesis testing interpretation. Informally,
one cannot effectively test whether a specific individual has contributed her
data by observing the output of a private mechanism---any test cannot have both
high significance and high power.
In this paper, we identify some conditions under which a privacy definition
given in terms of a statistical divergence satisfies a similar interpretation.
These conditions are useful to analyze the distinguishability power of
divergences and we use them to study the hypothesis testing interpretation of
some relaxations of differential privacy based on Renyi divergence. This
analysis also results in an improved conversion rule between these definitions
and differential privacy.
https://arxiv.org/pdf/1905.09982.pd
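For context, the formal statement behind the informal claim above is the standard hypothesis testing characterization of differential privacy (Wasserman and Zhou, later refined by Kairouz et al.); this is background, not a result specific to this paper. For an (ε, δ)-differentially private mechanism, any test of whether a given individual's data was used, with type I error α and type II error β, satisfies:

```latex
% Background sketch (Wasserman-Zhou / Kairouz et al.), not the paper's own result:
% any test on the output of an (epsilon, delta)-DP mechanism with
% type I error alpha and type II error beta obeys
\alpha + e^{\varepsilon}\,\beta \ge 1 - \delta,
\qquad
e^{\varepsilon}\,\alpha + \beta \ge 1 - \delta .
```

For small ε and δ these bounds force α + β to stay near 1, so no test can simultaneously achieve high significance (small α) and high power (small β), which is the informal statement in the abstract.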
Truncated Laplace and Gaussian mechanisms of RDP
The Laplace mechanism and the Gaussian mechanism are primary mechanisms in
differential privacy, widely applicable to many scenarios involving numerical
data. However, due to the infinite-range random variables they generate, the
Laplace and Gaussian mechanisms may return values that are semantically
impossible, such as negative numbers. To address this issue, we have designed
the truncated Laplace mechanism and Gaussian mechanism. For a given truncation
interval [a, b], the truncated Gaussian mechanism ensures the same Renyi
Differential Privacy (RDP) as the untruncated mechanism, regardless of the
values chosen for the truncation interval [a, b]. Similarly, the truncated
Laplace mechanism, for a specified interval [a, b], maintains the same RDP as the
untruncated mechanism. We provide the RDP expressions for each of them. We
believe that our study can further enhance the utility of differential privacy
in specific applications.
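As a rough illustration of the kind of mechanism the abstract describes, here is a minimal sketch of a truncated Gaussian mechanism implemented by rejection sampling; the function name, the rejection-sampling choice, and the parameters are assumptions for illustration, not taken from the paper.

```python
import numpy as np

def truncated_gaussian_mechanism(value, sigma, a, b, rng=None):
    """Sketch of a truncated Gaussian mechanism (illustrative, not the paper's code).

    Adds N(0, sigma^2) noise to `value` and resamples until the noisy output
    lies in the truncation interval [a, b], so semantically impossible values
    (e.g. negative counts) are never released. For the untruncated Gaussian
    mechanism with L2 sensitivity Delta, the standard RDP bound at order alpha
    is alpha * Delta**2 / (2 * sigma**2); per the abstract, truncation to
    [a, b] does not change the RDP guarantee.
    """
    rng = rng or np.random.default_rng()
    while True:
        noisy = value + rng.normal(0.0, sigma)
        if a <= noisy <= b:
            return noisy

# Example: release a count that must stay between 0 and 100.
print(truncated_gaussian_mechanism(42.0, sigma=4.0, a=0.0, b=100.0))
```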
DPSUR: Accelerating Differentially Private Stochastic Gradient Descent Using Selective Update and Release
Machine learning models are known to memorize private data to reduce their
training loss, which can be inadvertently exploited by privacy attacks such as
model inversion and membership inference. To protect against these attacks,
differential privacy (DP) has become the de facto standard for
privacy-preserving machine learning, particularly for popular training
algorithms based on stochastic gradient descent, such as DPSGD. Nonetheless, DPSGD
still suffers from severe utility loss due to its slow convergence. This is
partially caused by the random sampling, which brings bias and variance to the
gradient, and partially by the Gaussian noise, which leads to fluctuation of
gradient updates.
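For reference, a minimal sketch of the standard DPSGD aggregation step (per-sample clipping plus Gaussian noise), which is where the sampling variance and noise discussed above enter; this is the textbook recipe, not code from this paper.

```python
import numpy as np

def dpsgd_step(per_sample_grads, clip_norm, noise_multiplier, rng=None):
    """Sketch of one DPSGD aggregation step (textbook recipe, illustrative only).

    Each per-sample gradient is clipped to L2 norm `clip_norm`, the clipped
    gradients are summed, and Gaussian noise with scale
    noise_multiplier * clip_norm is added to bound any single example's influence.
    """
    rng = rng or np.random.default_rng()
    batch_size = len(per_sample_grads)
    clipped = [g * min(1.0, clip_norm / (np.linalg.norm(g) + 1e-12))
               for g in per_sample_grads]
    noisy_sum = np.sum(clipped, axis=0) + rng.normal(
        0.0, noise_multiplier * clip_norm, size=clipped[0].shape)
    return noisy_sum / batch_size  # noisy average gradient used for the update
```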
Our key idea to address these issues is to apply updates to the model
selectively during training, discarding those that are useless or even harmful.
Motivated by this, this paper proposes DPSUR, a Differentially Private training
framework based on Selective Updates and Release, where the gradient from each
iteration is evaluated based on a validation test, and only those updates
leading to convergence are applied to the model. As such, DPSUR ensures the
training in the right direction and thus can achieve faster convergence than
DPSGD. The main challenges lie in two aspects: privacy concerns arising from
gradient evaluation, and the gradient selection strategy for model updates. To
address the challenges, DPSUR introduces a clipping strategy for update
randomization and a threshold mechanism for gradient selection. Experiments
conducted on MNIST, FMNIST, CIFAR-10, and IMDB datasets show that DPSUR
significantly outperforms previous works in terms of convergence speed and
model utility.
Comment: This paper has been accepted by VLDB 202
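A hedged sketch of the selective update-and-release idea described above; the variable names, the specific acceptance rule, and the noise placement are illustrative assumptions, and the paper's actual clipping strategy, threshold mechanism, and privacy accounting differ in detail.

```python
import numpy as np

def selective_update_step(params, noisy_grad, lr, validation_loss,
                          clip_bound, sigma_eval, threshold, rng=None):
    """Sketch: apply a candidate DPSGD update only if it (noisily) helps.

    The change in validation loss caused by the candidate update is clipped to
    [-clip_bound, clip_bound] and perturbed with Gaussian noise, so the
    accept/reject decision is itself randomized; only updates whose noisy
    improvement beats `threshold` are applied and released, the rest are
    discarded and the previous model is kept.
    """
    rng = rng or np.random.default_rng()
    candidate = params - lr * noisy_grad
    delta = validation_loss(candidate) - validation_loss(params)
    noisy_delta = np.clip(delta, -clip_bound, clip_bound) + rng.normal(0.0, sigma_eval)
    if noisy_delta <= threshold:      # update appears to reduce validation loss
        return candidate, True        # accept and release
    return params, False              # reject: keep the previous parameters
```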
Unraveling the Connections between Privacy and Certified Robustness in Federated Learning Against Poisoning Attacks
Federated learning (FL) provides an efficient paradigm to jointly train a
global model leveraging data from distributed users. As local training data
comes from different users who may not be trustworthy, several studies have
shown that FL is vulnerable to poisoning attacks. Meanwhile, to protect the
privacy of local users, FL is usually trained in a differentially private way
(DPFL). Thus, in this paper, we ask: What are the underlying connections
between differential privacy and certified robustness in FL against poisoning
attacks? Can we leverage the innate privacy property of DPFL to provide
certified robustness for FL? Can we further improve the privacy of FL to
improve such robustness certification? We first investigate both user-level and
instance-level privacy of FL and provide formal privacy analysis to achieve
improved instance-level privacy. We then provide two robustness certification
criteria: certified prediction and certified attack inefficacy for DPFL on both
user and instance levels. Theoretically, we provide the certified robustness of
DPFL based on both criteria given a bounded number of adversarial users or
instances. Empirically, we conduct extensive experiments to verify our theories
under a range of poisoning attacks on different datasets. We find that
increasing the level of privacy protection in DPFL results in stronger
certified attack inefficacy; however, it does not necessarily lead to a
stronger certified prediction. Thus, achieving the optimal certified prediction
requires a proper balance between privacy and utility loss.
Comment: ACM CCS 202
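For concreteness, a minimal sketch of the user-level DPFL aggregation step (DP-FedAvg style) that this line of work builds on; the names and parameters are generic illustrations, not the paper's algorithm or its improved instance-level analysis.

```python
import numpy as np

def user_level_dp_aggregate(user_updates, clip_norm, noise_multiplier, rng=None):
    """Sketch of user-level differentially private aggregation (DP-FedAvg style).

    Each user's entire model update is clipped to L2 norm `clip_norm`, bounding
    any single user's influence on the global model; the clipped updates are
    summed and the server adds Gaussian noise calibrated to the clip norm.
    """
    rng = rng or np.random.default_rng()
    n = len(user_updates)
    clipped = [u * min(1.0, clip_norm / (np.linalg.norm(u) + 1e-12))
               for u in user_updates]
    noisy_sum = np.sum(clipped, axis=0) + rng.normal(
        0.0, noise_multiplier * clip_norm, size=clipped[0].shape)
    return noisy_sum / n  # noisy global update applied by the server
```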