2,345 research outputs found

    Achieving Secure and Differentially Private Computations in Multiparty Settings

    Sharing and working on sensitive data in distributed settings, from healthcare to finance, is a major challenge due to security and privacy concerns. Secure multiparty computation (SMC) is a viable solution, allowing distributed parties to perform computations while learning nothing about each other's data except the final result. Although SMC is instrumental in such distributed settings, it provides no guarantee against leaking information about individuals to adversaries. Differential privacy (DP) can be utilized to address this; however, achieving SMC with DP is not a trivial task either. In this paper, we propose a novel Secure Multiparty Distributed Differentially Private (SM-DDP) protocol to achieve secure and private computations in a multiparty environment. Specifically, with our protocol, we simultaneously achieve SMC and DP in distributed settings, focusing on linear regression on horizontally distributed data. That is, parties do not see each other's data and, further, cannot infer information about individuals from the final constructed statistical model. Any statistical model function that allows independent calculation of local statistics can be computed through our protocol. The protocol implements homomorphic encryption for SMC and the functional mechanism for DP to achieve the desired security and privacy guarantees. In this work, we first introduce the theoretical foundation for the SM-DDP protocol and then evaluate its efficacy and performance on two different datasets. Our results show that one can achieve individual-level privacy through the proposed protocol with distributed DP, which is independently applied by each party in a distributed fashion. Moreover, our results also show that the SM-DDP protocol incurs minimal computational overhead, is scalable, and provides security and privacy guarantees.
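
    The idea of computing a model from independently calculated, locally perturbed statistics can be illustrated with a minimal sketch. This is not the SM-DDP protocol itself: it assumes a one-dimensional regression without intercept (y = w * x), replaces homomorphic encryption with a plain sum, and uses simple Laplace noise with an illustrative, uncalibrated `scale` instead of the functional mechanism. All function names are hypothetical.

```python
import random


def laplace(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponentials."""
    if scale == 0:
        return 0.0
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def local_statistics(xs, ys, scale, rng):
    """Each party computes noisy sufficient statistics on its own rows only.

    `scale` is an illustrative noise level, NOT a calibrated sensitivity.
    """
    sxx = sum(x * x for x in xs) + laplace(scale, rng)
    sxy = sum(x * y for x, y in zip(xs, ys)) + laplace(scale, rng)
    return sxx, sxy


def aggregate_and_fit(stats):
    """Sum the per-party statistics and solve for the slope w in y = w * x.

    In a real protocol this sum would be taken over encrypted values;
    here a plain sum stands in for the homomorphic addition.
    """
    total_sxx = sum(s[0] for s in stats)
    total_sxy = sum(s[1] for s in stats)
    return total_sxy / total_sxx


if __name__ == "__main__":
    rng = random.Random(42)
    party_a = local_statistics([1.0, 2.0], [2.0, 4.0], 0.1, rng)
    party_b = local_statistics([3.0], [6.0], 0.1, rng)
    print(aggregate_and_fit([party_a, party_b]))  # close to the true slope 2
```

    Because each party perturbs only its own statistics, no raw record leaves its owner; the accuracy cost depends on the noise scale, which a real deployment would derive from the privacy budget.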

    PrivMin: Differentially Private MinHash for Jaccard Similarity Computation

    In many industrial applications of big data, Jaccard similarity computation has been widely used to measure the distance between two profiles or sets owned by two different users. Yet a semi-honest user with unpredictable knowledge may deduce private or sensitive information about the other user (e.g., the existence of a single element in the original sets) from the shared similarity. In this paper, we aim to solve the privacy issues in Jaccard similarity computation with strict differential privacy guarantees. To achieve this, we first define Conditional $\epsilon$-DPSO, a relaxed differential privacy definition regarding set operations, and prove that MinHash-based Jaccard Similarity Computation (MH-JSC) satisfies this definition. Then, to achieve strict differential privacy in MH-JSC, we propose the PrivMin algorithm, which consists of two private operations: 1) Private MinHash Value Generation, which introduces Exponential noise into the generation of the MinHash signature; and 2) Randomized MinHashing Steps Selection, which adopts the Randomized Response technique to privately select several steps within the MinHashing phase that are deployed with the Exponential mechanism. Experiments on real datasets demonstrate that the proposed PrivMin algorithm can successfully retain the utility of the computed similarity while preserving privacy. Comment: 27 pages, 6 figures, 4 table
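
    For orientation, the non-private baseline that this abstract starts from (MinHash-based Jaccard estimation) can be sketched as follows. This is not the PrivMin algorithm: no noise or randomized response is applied. The salted SHA-256 hash family and the function names are assumptions made for illustration, and input sets must be non-empty.

```python
import hashlib


def _hash(seed, item):
    """One member of a hash family, indexed by `seed` (illustrative choice)."""
    data = f"{seed}:{item}".encode("utf-8")
    return int.from_bytes(hashlib.sha256(data).digest()[:8], "big")


def minhash_signature(items, num_hashes=128):
    """For each hash function, keep the minimum hash value over the set."""
    return [min(_hash(seed, it) for it in items) for seed in range(num_hashes)]


def estimate_jaccard(sig_a, sig_b):
    """The fraction of matching signature positions estimates |A∩B| / |A∪B|."""
    matches = sum(1 for a, b in zip(sig_a, sig_b) if a == b)
    return matches / len(sig_a)


def exact_jaccard(a, b):
    """Exact Jaccard similarity, for comparison with the estimate."""
    return len(a & b) / len(a | b)


if __name__ == "__main__":
    a = set(range(0, 60))
    b = set(range(30, 90))
    print(exact_jaccard(a, b))
    print(estimate_jaccard(minhash_signature(a), minhash_signature(b)))
```

    The estimate works because, for a random hash function, the two sets share the same minimum with probability equal to their Jaccard similarity. The shared signatures are exactly what a semi-honest party could inspect, which is the leakage the abstract sets out to control.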

    Privacy Preserving Face Recognition Utilizing Differential Privacy

    Facial recognition technologies are implemented in many areas, including but not limited to citizen surveillance, crime control, activity monitoring, and facial expression evaluation. However, processing biometric information is a resource-intensive task that often involves third-party servers, which can be accessed by adversaries with malicious intent. Biometric information delivered to untrusted third-party servers in an uncontrolled manner can be considered a significant privacy leak (i.e., uncontrolled information release), as biometrics can be correlated with sensitive data such as healthcare or financial records. In this paper, we propose a privacy-preserving technique for "controlled information release", where we disguise an original face image and prevent leakage of the biometric features while identifying a person. We introduce a new privacy-preserving face recognition protocol named PEEP (Privacy using EigEnface Perturbation) that utilizes local differential privacy. PEEP applies perturbation to Eigenfaces utilizing differential privacy and stores only the perturbed data in the third-party servers to run a standard Eigenface recognition algorithm. As a result, the trained model will not be vulnerable to privacy attacks such as membership inference and model memorization attacks. Our experiments show that PEEP exhibits a classification accuracy of around 70% to 90% under standard privacy settings.

    Privacy in Deep Learning: A Survey

    The ever-growing advances of deep learning in many areas, including vision, recommendation systems, and natural language processing, have led to the adoption of Deep Neural Networks (DNNs) in production systems. The availability of large datasets and high computational power are the main contributors to these advances. The datasets are usually crowdsourced and may contain sensitive information. This poses serious privacy concerns, as this data can be misused or leaked through various vulnerabilities. Even if the cloud provider and the communication link are trusted, there are still threats of inference attacks, where an attacker could infer properties of the data used for training or find the underlying model architecture and parameters. In this survey, we review the privacy concerns brought by deep learning and the mitigating techniques introduced to tackle these issues. We also show that there is a gap in the literature regarding test-time inference privacy, and propose possible future research directions.

    Inherit Differential Privacy in Distributed Setting: Multiparty Randomized Function Computation

    How to achieve differential privacy in the distributed setting, where the dataset is distributed among distrustful parties, is an important problem. We consider under what conditions a protocol can inherit the differential privacy property of the function it computes. The heart of the problem is the secure multiparty computation of randomized functions. A notion of obliviousness is introduced, which captures the key security problems when computing a randomized function from a deterministic one in the distributed setting. From this observation, a necessary and sufficient condition for computing a randomized function from a deterministic one is given. This result can be used not only to determine whether a protocol computing a differentially private function is secure, but also to construct a secure one. We then prove that the differential privacy property of a function can be inherited by the protocol computing it if the protocol privately computes it. A composition theorem for differentially private protocols is also presented. We also construct protocols to generate random variates in the distributed setting, such as uniform random variates and the inversion method. Using these fundamental protocols, we construct protocols for the Gaussian mechanism, the Laplace mechanism, and the Exponential mechanism. Importantly, all these protocols satisfy obliviousness and so can be proved secure in a simulation-based manner. We also provide a complexity bound for computing randomized functions in the distributed setting. Finally, to show that our results are fundamental and powerful for multiparty differential privacy, we construct a differentially private empirical risk minimization protocol.
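
    The link between uniform variates, the inversion method, and the Laplace mechanism can be seen in a single-party sketch. The multiparty protocols of the paper are not reproduced here; the function names are hypothetical, and `sensitivity` and `epsilon` are the usual Laplace-mechanism parameters (noise scale = sensitivity / epsilon).

```python
import math
import random


def laplace_inverse_cdf(u, scale):
    """Map a uniform variate u in (0, 1) to a Laplace(0, scale) variate.

    Inversion method: x = -scale * sgn(u - 1/2) * ln(1 - 2 * |u - 1/2|).
    """
    centered = u - 0.5
    return -scale * math.copysign(1.0, centered) * math.log(1.0 - 2.0 * abs(centered))


def laplace_mechanism(true_value, sensitivity, epsilon, rng):
    """Release true_value plus Laplace noise of scale sensitivity / epsilon."""
    u = rng.random()
    while u == 0.0:  # u = 0 would require log(0); resample this rare case
        u = rng.random()
    return true_value + laplace_inverse_cdf(u, sensitivity / epsilon)


if __name__ == "__main__":
    rng = random.Random(7)
    # A counting query has sensitivity 1: one record changes the count by at most 1.
    print(laplace_mechanism(100, sensitivity=1.0, epsilon=0.5, rng=rng))
```

    In the distributed setting described in the abstract, the uniform variate would itself be generated jointly so that no single party knows the noise; the sketch above only shows the deterministic transformation applied to it.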

    Privacy Preserving Record Linkage via grams Projections

    Record linkage has been extensively used in various data mining applications involving data sharing. While the amount of available data is growing, the concern of disclosing sensitive information poses the problem of utility vs. privacy. In this paper, we study the problem of private record linkage via secure data transformations. In contrast to the existing techniques in this area, we propose a novel approach that provides strong privacy guarantees under the formal framework of differential privacy. We develop an embedding strategy based on frequent variable-length grams mined in a private way from the original data. We also introduce a personalized threshold for matching individual records in the embedded space, which achieves better linkage accuracy than the existing global threshold approach. Compared with the state-of-the-art secure matching schema, our approach provides formal, provable privacy guarantees and achieves better scalability while providing comparable utility.

    Differential Privacy Techniques for Cyber Physical Systems: A Survey

    Modern cyber physical systems (CPSs) have been widely used in our daily lives because of the development of information and communication technologies (ICT). With the provision of CPSs, the security and privacy threats associated with these systems are also increasing. Passive attacks are being used by intruders to get access to private information of CPSs. In order to make CPS data more secure, certain privacy preservation strategies, such as encryption and k-anonymity, have been presented in the past. However, with the advances in CPS architecture, these techniques also need certain modifications. Meanwhile, differential privacy has emerged as an efficient technique to protect CPS data privacy. In this paper, we present a comprehensive survey of differential privacy techniques for CPSs. In particular, we survey the application and implementation of differential privacy in four major applications of CPSs: energy systems, transportation systems, healthcare and medical systems, and the industrial Internet of things (IIoT). Furthermore, we present open issues, challenges, and future research directions for differential privacy techniques for CPSs. This survey can serve as a basis for the development of modern differential privacy techniques to address various problems and data privacy scenarios of CPSs. Comment: 46 pages, 12 figure

    Privacy-Preserving Multiparty Learning For Logistic Regression

    In recent years, machine learning techniques have been widely used in numerous applications, such as weather forecasting, financial data analysis, spam filtering, and medical prediction. In the meantime, massive data generated from multiple sources further improve the performance of machine learning tools. However, data sharing from multiple sources brings privacy issues for those sources, since sensitive information may be leaked in this process. In this paper, we propose a framework enabling multiple parties to collaboratively and accurately train a learning model over distributed datasets while guaranteeing the privacy of data sources. Specifically, we consider the logistic regression model for data training and propose two approaches for perturbing the objective function to preserve $\epsilon$-differential privacy. The proposed solutions are tested on real datasets, including Bank Marketing and Credit Card Default prediction. Experimental results demonstrate that the proposed multiparty learning framework is highly efficient and accurate. Comment: This work was done when Wei Du was at the University of Arkansa

    Computational Differential Privacy from Lattice-based Cryptography

    The emerging technologies for large scale data analysis raise new challenges to the security and privacy of sensitive user data. In this work we investigate the problem of private statistical analysis of time-series data in the distributed and semi-honest setting. In particular, we study some properties of Private Stream Aggregation (PSA), first introduced by Shi et al. 2017. This is a computationally secure protocol for the collection and aggregation of data in a distributed network and has a very small communication cost. In the non-adaptive query model, a secure PSA scheme can be built upon any key-homomorphic weak pseudo-random function as shown by Valovich 2017, yielding security guarantees in the standard model, which is in contrast to Shi et al. We show that every mechanism which preserves $(\epsilon,\delta)$-differential privacy in effect preserves computational $(\epsilon,\delta)$-differential privacy when it is executed through a secure PSA scheme. Furthermore, we introduce a novel perturbation mechanism based on the symmetric Skellam distribution that is suited for preserving differential privacy in the distributed setting, and find that its performance in terms of privacy and accuracy is comparable to that of previous solutions. On the other hand, we leverage its specific properties to construct a computationally efficient prospective post-quantum protocol for differentially private time-series data analysis in the distributed model. The security of this protocol is based on the hardness of a new variant of the Decisional Learning with Errors (DLWE) problem. In this variant the errors are taken from the symmetric Skellam distribution. We show that this new variant is hard based on the hardness of the standard Learning with Errors (LWE) problem where the errors are taken from the discrete Gaussian distribution. Thus, we provide a variant of the LWE problem that is hard... Comment: arXiv admin note: substantial text overlap with arXiv:1507.0807

    Private Stream Aggregation Revisited

    In this work, we investigate the problem of private statistical analysis in the distributed and semi-honest setting. In particular, we study properties of Private Stream Aggregation schemes, first introduced by Shi et al. [2]. These are computationally secure protocols for the aggregation of data in a network and have a very small communication cost. We show that such schemes can be built upon any key-homomorphic weak pseudo-random function. Thus, in contrast to the aforementioned work, our security definition can be achieved in the standard model. In addition, we give a computationally efficient instantiation of this protocol based on the Decisional Diffie-Hellman problem. Moreover, we show that every mechanism which preserves $(\epsilon,\delta)$-differential privacy provides computational $(\epsilon,\delta)$-differential privacy when it is executed through a Private Stream Aggregation scheme. Finally, we introduce a novel perturbation mechanism based on the Skellam distribution that is suited for the distributed setting, and compare its performance with that of previous solutions. Comment: 33 pages, 2 tables, 1 figur
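
    One reason the Skellam distribution suits the distributed setting is that it is closed under summation: if each of n users adds independent symmetric Skellam noise with parameter mu / n, the aggregate carries symmetric Skellam noise with parameter mu. The sketch below illustrates only this property, assuming the convention that a symmetric Skellam variate with parameter mu is the difference of two independent Poisson(mu) variates. It omits the encryption layer of a Private Stream Aggregation scheme entirely, and the function names are hypothetical.

```python
import math
import random


def poisson(lam, rng):
    """Sample Poisson(lam) with Knuth's multiplication method (small lam only)."""
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1


def skellam(mu, rng):
    """Symmetric Skellam variate: difference of two independent Poisson(mu)."""
    return poisson(mu, rng) - poisson(mu, rng)


def noisy_sum(values, total_mu, rng):
    """Each user perturbs its own integer value with a 1/n share of the noise."""
    share = total_mu / len(values)
    return sum(v + skellam(share, rng) for v in values)


if __name__ == "__main__":
    rng = random.Random(11)
    print(noisy_sum([3, 5, 7], total_mu=4.0, rng=rng))  # true sum is 15
```

    The noise is integer-valued, so perturbed values remain integers and can be fed to schemes that operate over finite groups; how mu relates to a target $(\epsilon,\delta)$ is left to the paper's analysis and is not derived here.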