    Notes on Information-Theoretic Privacy

    We investigate the tradeoff between privacy and utility in a setting where both privacy and utility are measured in terms of mutual information. For the binary case, we fully characterize this tradeoff under perfect privacy and also give an upper bound for the case where some privacy leakage is allowed. We then introduce a new quantity that quantifies the amount of private information contained in the observable data, and connect it to the optimal tradeoff between privacy and utility. Comment: The corrected version of a paper that appeared in Allerton 201
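
    As a concrete illustration of the kind of tradeoff this abstract studies, the sketch below computes utility I(X;Y) and leakage I(S;Y) for a small binary example. The joint distribution p_sx and the bit-flip release mechanism are assumptions made for the illustration; this is a numerical toy, not the authors' characterization.

    ```python
    import numpy as np

    def mutual_information(p_xy):
        """I(X;Y) in bits for a joint distribution given as a 2-D array."""
        p_x = p_xy.sum(axis=1, keepdims=True)
        p_y = p_xy.sum(axis=0, keepdims=True)
        mask = p_xy > 0
        return float((p_xy[mask] * np.log2(p_xy[mask] / (p_x * p_y)[mask])).sum())

    # Assumed joint distribution of a private bit S and the observable bit X.
    p_sx = np.array([[0.4, 0.1],
                     [0.1, 0.4]])

    # Privacy mechanism: release Y by flipping X with probability eps
    # (a binary symmetric channel).
    for eps in (0.0, 0.2, 0.5):
        channel = np.array([[1 - eps, eps],
                            [eps, 1 - eps]])  # channel[x, y] = P(Y=y | X=x)
        p_x = p_sx.sum(axis=0)                # marginal of X
        p_xy = p_x[:, None] * channel         # joint of (X, Y): utility side
        p_sy = p_sx @ channel                 # joint of (S, Y): leakage side
        print(f"eps={eps}: utility I(X;Y) = {mutual_information(p_xy):.3f} bits, "
              f"leakage I(S;Y) = {mutual_information(p_sy):.3f} bits")
    ```

    At eps=0 the release is fully useful but leaks the most about S; at eps=0.5 both mutual informations drop to zero, tracing out the two extremes of the tradeoff.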

    Measuring privacy leakage in terms of Shannon entropy

    Differential privacy is a privacy scheme in which a database is modified so that each user's personal data are protected without significantly affecting the characteristics of the data as a whole. An example of such a mechanism is Randomized Aggregatable Privacy-Preserving Ordinal Response (RAPPOR). It has since been observed that the interpretations of the privacy, accuracy, and utility parameters in differential privacy are not entirely clear. This article therefore proposes an alternative definition of these privacy aspects, measured in terms of Shannon entropy. Here, Shannon entropy can be interpreted as the number of binary questions an aggregator needs to ask in order to learn information from a modified database. The privacy leakage of a differentially private mechanism is then defined as the mutual information between the original distribution of an attribute in a database and its modified version. Furthermore, MATLAB simulations of special cases of RAPPOR are presented to show that this alternative definition makes sense.
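
    A minimal sketch of the leakage definition the abstract gives: mutual information between an attribute's original distribution and its randomized output. The binary randomized-response channel and the attribute distribution below are illustrative assumptions standing in for RAPPOR, not the article's exact simulation setup.

    ```python
    import numpy as np

    def entropy(p):
        """Shannon entropy in bits; in the article's reading, the number of
        binary questions needed to learn the outcome."""
        p = p[p > 0]
        return float(-(p * np.log2(p)).sum())

    def leakage(p_x, channel):
        """Mutual information I(X; X~) between the original attribute X and
        its randomized version X~, where channel[i, j] = P(X~=j | X=i)."""
        p_out = p_x @ channel
        h_out_given_x = float((p_x * np.array([entropy(row) for row in channel])).sum())
        return entropy(p_out) - h_out_given_x

    # Binary randomized response: report truthfully with probability p,
    # otherwise flip -- a simplified stand-in for a RAPPOR-style mechanism.
    def rr_channel(p):
        return np.array([[p, 1 - p],
                         [1 - p, p]])

    p_x = np.array([0.7, 0.3])  # assumed original attribute distribution
    for p in (1.0, 0.75, 0.5):
        print(f"truth probability {p}: leakage = {leakage(p_x, rr_channel(p)):.3f} bits")
    ```

    With truthful reporting the leakage equals the attribute's full entropy, and it falls to zero when responses are uniformly random, matching the intuition behind the proposed definition.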

    Consideration of Data Security and Privacy Using Machine Learning Techniques

    As artificial intelligence becomes more prevalent, machine learning algorithms are being used in a widening range of domains. Machine learning depends on big data and processing power, with the data typically gathered via crowdsourcing or acquired online. The data collected for training frequently include sensitive and private information such as ID numbers, personal mobile phone numbers, and medical records. How to protect such sensitive private data effectively and cheaply is a significant issue. With this issue in mind, this article first discusses the privacy dilemma in machine learning and how it might be exploited, then summarizes the features of, and techniques for, protecting privacy in machine learning algorithms. Next, combining a convolutional neural network with a secure privacy approach is suggested to improve the classification accuracy of algorithms that employ noise to safeguard privacy. This approach can allocate a privacy budget to each layer of the neural network and fully incorporates the properties of the Gaussian distribution and differential privacy. Lastly, the Gaussian noise scale is set, and the sensitive information in the data is preserved by perturbing the gradient values of a stochastic gradient descent procedure. The experimental results showed that, by adjusting the parameters of the deep differential privacy model according to variations in the private information in the data, a balance between accessibility and privacy protection of the training data set could be achieved with an accuracy of 99.05%.
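
    The Gaussian-noised gradient step the abstract describes resembles the widely used DP-SGD recipe; below is a minimal sketch of one clipped, noised gradient update. The function name, stand-in gradients, and parameter values are illustrative assumptions, not the paper's exact model or its per-layer budget allocation.

    ```python
    import numpy as np

    def dp_sgd_step(w, per_example_grads, clip_norm, noise_multiplier, lr, rng):
        """One differentially private SGD step: clip each per-example gradient
        to L2 norm clip_norm, average, add Gaussian noise calibrated to the
        clipping bound, and update the weights."""
        clipped = [g * min(1.0, clip_norm / max(np.linalg.norm(g), 1e-12))
                   for g in per_example_grads]
        mean_grad = np.mean(clipped, axis=0)
        # Noise std on the averaged gradient: noise_multiplier * C / batch size.
        sigma = noise_multiplier * clip_norm / len(per_example_grads)
        noisy_grad = mean_grad + rng.normal(0.0, sigma, size=w.shape)
        return w - lr * noisy_grad

    # Toy usage with random stand-in gradients; values are illustrative only.
    rng = np.random.default_rng(0)
    w = np.zeros(5)
    grads = [rng.normal(size=5) for _ in range(32)]
    w = dp_sgd_step(w, grads, clip_norm=1.0, noise_multiplier=1.1, lr=0.1, rng=rng)
    print(w)
    ```

    The clipping bound caps any single example's influence on the update, which is what lets the added Gaussian noise be calibrated to a fixed sensitivity, the same balance between accuracy and privacy protection the abstract reports tuning.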