Individual Privacy vs Population Privacy: Learning to Attack Anonymization
Over the last decade there have been great strides made in developing
techniques to compute functions privately. In particular, Differential Privacy
gives strong guarantees about the conclusions that can be drawn about an individual.
In contrast, various syntactic methods for providing privacy (criteria such as
k-anonymity and l-diversity) have been criticized for still allowing private
information of an individual to be inferred. In this report, we consider the
ability of an attacker to use data meeting privacy definitions to build an
accurate classifier. We demonstrate that even under Differential Privacy, such
classifiers can be used to accurately infer "private" attributes in realistic
data. We compare this to similar inference-based attacks on other
forms of anonymized data. We place these attacks on the same scale, and observe
that the accuracy of inferring private attributes from Differentially Private
data and from l-diverse data can be quite similar.
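As a rough illustration of the inference attack this abstract describes (not the authors' exact setup), the sketch below trains an off-the-shelf classifier on a noised release of a toy table and measures how accurately it recovers a sensitive attribute; the dataset, noise scale, and model choice are all our own assumptions.

import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Hypothetical released table: quasi-identifiers X and a sensitive label y.
# The Laplace noise on X stands in for a differentially private release.
n = 5000
X = rng.normal(size=(n, 5))
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)        # "private" attribute
X_released = X + rng.laplace(scale=1.0, size=X.shape)

X_tr, X_te, y_tr, y_te = train_test_split(X_released, y, test_size=0.3, random_state=0)
attacker = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_tr, y_tr)

# Accuracy well above the ~50% base rate means the sensitive attribute
# remains inferable from the noised release.
print("inference accuracy:", accuracy_score(y_te, attacker.predict(X_te)))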
Practical differential privacy in high dimensions
Privacy-preserving, and more concretely differentially private, machine learning is
concerned with hiding specific details in training datasets that contain sensitive
information. Many proposed differentially private machine learning algorithms have
promising theoretical properties, such as convergence to non-private performance in
the limit of infinite data, computational efficiency, and polynomial sample complexity.
Unfortunately, these properties have not always translated to real-world applications
of private machine learning methods, which is why their adoption by practitioners has
been slow. For many typical problems and sample sizes, classification accuracy has
been unsatisfactory. Through feature selection that preserves end-to-end privacy, this
work has demonstrated that private machine learning algorithms can indeed be useful in
practice. In particular, we propose a new feature selection mechanism, which fits well
with the design constraints imposed by differential privacy, and allows for improved
scalability of private classifiers in realistic settings. We investigate differentially private
Naive Bayes and Logistic Regression and show non-trivial performance on a number of
datasets. Significant empirical evidence suggests that the number of features and the
number of hyperparameters can be determining factors in the performance of
differentially private classifiers.
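The abstract does not spell out the proposed mechanism, so as a generic illustration of feature selection under a differential privacy budget, here is a one-shot noisy top-k selector, a standard report-noisy-max variant; the function name, sensitivity assumption, and noise calibration are our own, not the paper's.

import numpy as np

def noisy_top_k(scores, k, epsilon, sensitivity=1.0, rng=None):
    # Add Laplace(2*k*sensitivity/epsilon) noise to each per-feature utility
    # score and keep the k largest noisy scores; the factor 2*k accounts for
    # selecting k items in a single shot.
    rng = rng or np.random.default_rng()
    scale = 2.0 * k * sensitivity / epsilon
    noisy = np.asarray(scores, dtype=float) + rng.laplace(scale=scale, size=len(scores))
    return np.argsort(noisy)[-k:][::-1]

# Example: keep the 10 highest-scoring of 1000 features at epsilon = 1.
scores = np.random.default_rng(0).random(1000)
print(noisy_top_k(scores, k=10, epsilon=1.0))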
A Survey on Differential Privacy with Machine Learning and Future Outlook
Nowadays, machine learning models and applications have become increasingly
pervasive. With this rapid increase in the development and deployment of
machine learning models, privacy concerns have grown, and there is a
legitimate need to protect data from leakage and attacks. One of
the strongest and most prevalent privacy models that can be used to protect
machine learning models from any attacks and vulnerabilities is differential
privacy (DP). DP is a strict and rigorous definition of privacy that
guarantees an adversary cannot reliably determine whether a specific
participant is included in the dataset. It works by injecting noise into the
data, whether into the inputs, the outputs, the ground-truth labels, the
objective functions, or even the gradients, to mitigate privacy risks and
protect the data. To this end, this survey paper presents different
differentially private machine learning algorithms categorized into two main
categories (traditional machine learning models vs. deep learning models).
Moreover, future research directions for differential privacy with machine
learning algorithms are outlined.
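For reference, the guarantee this abstract alludes to is the standard definition of $\varepsilon$-differential privacy: a randomized mechanism $\mathcal{M}$ is $\varepsilon$-differentially private if for all datasets $D, D'$ differing in one record and all output sets $S$,
\[
\Pr[\mathcal{M}(D) \in S] \le e^{\varepsilon}\, \Pr[\mathcal{M}(D') \in S],
\]
and a common way to achieve it for a numeric query $f$ with $L_1$-sensitivity $\Delta f$ is the Laplace mechanism,
\[
\mathcal{M}(D) = f(D) + \mathrm{Lap}\!\left(\Delta f / \varepsilon\right).
\]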
On the Differential Privacy of Bayesian Inference
We study how to communicate findings of Bayesian inference to third parties,
while preserving the strong guarantee of differential privacy. Our main
contributions are four different algorithms for private Bayesian inference on
probabilistic graphical models. These include two mechanisms for adding noise
to the Bayesian updates, either directly to the posterior parameters, or to
their Fourier transform so as to preserve update consistency. We also utilise a
recently introduced posterior sampling mechanism, for which we prove bounds for
the specific but general case of discrete Bayesian networks; and we introduce a
maximum-a-posteriori private mechanism. Our analysis includes utility and
privacy bounds, with a novel focus on the influence of graph structure on
privacy. Worked examples and experiments with Bayesian naïve Bayes and
Bayesian linear regression illustrate the application of our mechanisms.
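As a toy illustration of the first idea above, noise added directly to the posterior parameters, here is a sketch for a Beta-Bernoulli model; the paper treats general discrete Bayesian networks, so this special case, the function name, and the neighbor convention are our own assumptions.

import numpy as np

def private_beta_posterior(data, alpha=1.0, beta=1.0, epsilon=1.0, rng=None):
    # Release a differentially private Beta posterior for a Bernoulli
    # parameter. Replacing one binary observation changes the pair
    # (successes, failures) by at most 1 each (L1-sensitivity 2), so adding
    # i.i.d. Laplace(2/epsilon) noise to each count gives epsilon-DP; the
    # clamp to 0 is harmless post-processing.
    rng = rng or np.random.default_rng()
    data = np.asarray(data)
    successes = float(data.sum())
    failures = float(len(data) - data.sum())
    s_noisy = max(0.0, successes + rng.laplace(scale=2.0 / epsilon))
    f_noisy = max(0.0, failures + rng.laplace(scale=2.0 / epsilon))
    return alpha + s_noisy, beta + f_noisy

obs = np.random.default_rng(0).integers(0, 2, size=200)
print(private_beta_posterior(obs, epsilon=1.0))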