
    Individual Privacy vs Population Privacy: Learning to Attack Anonymization

    Over the last decade there have been great strides made in developing techniques to compute functions privately. In particular, Differential Privacy gives strong guarantees about the conclusions that can be drawn about an individual. In contrast, various syntactic methods for providing privacy (criteria such as k-anonymity and l-diversity) have been criticized for still allowing private information about an individual to be inferred. In this report, we consider the ability of an attacker to use data meeting privacy definitions to build an accurate classifier. We demonstrate that even under Differential Privacy, such classifiers can be used to accurately infer "private" attributes in realistic data. We compare this to similar approaches for inference-based attacks on other forms of anonymized data. We place these attacks on the same scale, and observe that the accuracy of inference of private attributes for Differentially Private data and l-diverse data can be quite similar.
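    The attack setting described above can be illustrated with a minimal sketch: train an off-the-shelf classifier on a privatized release and measure how well it recovers the sensitive column. The file name, the column name, and the choice of scikit-learn's DecisionTreeClassifier are assumptions for illustration, not the authors' implementation.

        # Hypothetical sketch: infer a "private" attribute from a privatized release
        # (e.g. an l-diverse table or DP-synthesized data) using its remaining columns.
        import pandas as pd
        from sklearn.tree import DecisionTreeClassifier
        from sklearn.model_selection import train_test_split
        from sklearn.metrics import accuracy_score

        released = pd.read_csv("privatized_release.csv")   # assumed input file
        target = "income_bracket"                           # assumed sensitive column

        X = pd.get_dummies(released.drop(columns=[target]))
        y = released[target]
        X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

        attacker = DecisionTreeClassifier(max_depth=5).fit(X_tr, y_tr)
        print("attribute-inference accuracy:", accuracy_score(y_te, attacker.predict(X_te)))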

    Practical differential privacy in high dimensions

    Privacy-preserving, and more concretely differentially private, machine learning is concerned with hiding specific details in training datasets which contain sensitive information. Many proposed differentially private machine learning algorithms have promising theoretical properties, such as convergence to non-private performance in the limit of infinite data, computational efficiency, and polynomial sample complexity. Unfortunately, these properties have not always translated to real-world applications of private machine learning methods, which is why their adoption by practitioners has been slow. For many typical problems and sample sizes, classification accuracy has been unsatisfactory. Through feature selection which preserves end-to-end privacy, this work has demonstrated that private machine learning algorithms can indeed be useful in practice. In particular, we propose a new feature selection mechanism, which fits well with the design constraints imposed by differential privacy, and allows for improved scalability of private classifiers in realistic settings. We investigate differentially private Naive Bayes and Logistic Regression and show non-trivial performance on a number of datasets. Significant empirical evidence suggests that the number of features and the number of hyperparameters can be determining factors of the performance of differentially private classifiers.
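    The paper's specific feature selection mechanism is not reproduced here; as a generic sketch of the design constraints it must respect, the snippet below releases the class and per-class feature counts of a binary Naive Bayes model under the Laplace mechanism. The even epsilon split and the sensitivity accounting are assumptions for illustration.

        # Generic sketch (not the paper's mechanism): epsilon-DP release of the
        # sufficient statistics of a Naive Bayes model over binary features.
        import numpy as np

        def dp_naive_bayes_counts(X, y, epsilon, rng=np.random.default_rng(0)):
            """X: (n, d) 0/1 matrix, y: (n,) 0/1 labels; returns Laplace-noised counts."""
            d = X.shape[1]
            eps_class, eps_feat = epsilon / 2, epsilon / 2   # assumed budget split
            classes = np.unique(y)
            # One record moves one class count by 1 (L1 sensitivity 1).
            class_counts = {c: (y == c).sum() + rng.laplace(0, 1 / eps_class)
                            for c in classes}
            # One record can move up to d feature counts by 1 (L1 sensitivity d).
            feat_counts = {c: X[y == c].sum(axis=0) + rng.laplace(0, d / eps_feat, d)
                           for c in classes}
            return class_counts, feat_counts

    Classification would then proceed as in ordinary Naive Bayes, after clamping the noisy counts to be non-negative and smoothing the resulting probabilities.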

    A Survey on Differential Privacy with Machine Learning and Future Outlook

    Nowadays, machine learning models and applications have become increasingly pervasive. With this rapid increase in the development and employment of machine learning models, concerns regarding privacy have risen. Thus, there is a legitimate need to protect the data from leakage and from attacks. One of the strongest and most prevalent privacy models that can be used to protect machine learning models from attacks and vulnerabilities is differential privacy (DP). DP is a strict and rigorous definition of privacy, which guarantees that an adversary cannot reliably predict whether a specific participant is included in the dataset or not. It works by injecting noise into the data, whether into the inputs, the outputs, the ground-truth labels, the objective functions, or even the gradients, to alleviate the privacy issue and protect the data. To this end, this survey paper presents different differentially private machine learning algorithms categorized into two main categories (traditional machine learning models vs. deep learning models). Moreover, future research directions for differential privacy with machine learning algorithms are outlined. Comment: 12 pages, 3 figures
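    As a minimal illustration of the output-perturbation style of noise injection mentioned above, the sketch below applies the standard Laplace mechanism to a counting query; the dataset, predicate, and epsilon value are arbitrary examples.

        # Laplace-mechanism sketch: epsilon-DP release of a counting query,
        # which has sensitivity 1, so noise with scale 1/epsilon suffices.
        import numpy as np

        def laplace_count(records, predicate, epsilon, rng=np.random.default_rng()):
            true_count = sum(1 for r in records if predicate(r))
            return true_count + rng.laplace(loc=0.0, scale=1.0 / epsilon)

        # Illustrative usage: noisy count of ages above 60 in a toy list.
        ages = [34, 71, 65, 22, 58, 80]
        print(laplace_count(ages, lambda a: a > 60, epsilon=0.5))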

    On the Differential Privacy of Bayesian Inference

    We study how to communicate findings of Bayesian inference to third parties, while preserving the strong guarantee of differential privacy. Our main contributions are four different algorithms for private Bayesian inference on probabilistic graphical models. These include two mechanisms for adding noise to the Bayesian updates, either directly to the posterior parameters, or to their Fourier transform so as to preserve update consistency. We also utilise a recently introduced posterior sampling mechanism, for which we prove bounds for the specific but general case of discrete Bayesian networks; and we introduce a maximum-a-posteriori private mechanism. Our analysis includes utility and privacy bounds, with a novel focus on the influence of graph structure on privacy. Worked examples and experiments with Bayesian naïve Bayes and Bayesian linear regression illustrate the application of our mechanisms. Comment: AAAI 2016, Feb 2016, Phoenix, Arizona, United States
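    One of the ideas above, adding noise directly to the posterior parameters, can be sketched for a Beta-Bernoulli model as follows; the sensitivity bookkeeping and clamping are illustrative assumptions, not the paper's exact mechanisms or bounds.

        # Sketch: release a Beta posterior over a coin's bias by adding Laplace
        # noise to its sufficient statistics (counts of heads and tails).
        import numpy as np

        def private_beta_posterior(obs, epsilon, alpha0=1.0, beta0=1.0,
                                   rng=np.random.default_rng(0)):
            """obs: list of 0/1 outcomes; returns noisy (alpha, beta) parameters.
            Changing one observation flips a head to a tail (or vice versa), so the
            pair of counts changes by at most 2 in L1 norm."""
            heads = sum(obs)
            tails = len(obs) - heads
            noisy_heads = heads + rng.laplace(0, 2.0 / epsilon)
            noisy_tails = tails + rng.laplace(0, 2.0 / epsilon)
            # Clamp so the released parameters still define a valid Beta posterior.
            return alpha0 + max(noisy_heads, 0.0), beta0 + max(noisy_tails, 0.0)

        print(private_beta_posterior([1, 0, 1, 1, 0, 1], epsilon=1.0))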