Achieving Secure and Differentially Private Computations in Multiparty Settings
Sharing and working on sensitive data in distributed settings, from healthcare to finance, is a major challenge due to security and privacy concerns. Secure multiparty computation (SMC) is a viable remedy, allowing distributed parties to jointly compute a function while learning nothing about each other's data beyond the final result. Although SMC is instrumental in such distributed settings, it provides no guarantee that the final result itself does not leak information about individuals to adversaries. Differential privacy (DP) can address this; however, combining SMC with DP is not a trivial task either. In
this paper, we propose a novel Secure Multiparty Distributed Differentially
Private (SM-DDP) protocol to achieve secure and private computations in a
multiparty environment. Specifically, our protocol simultaneously achieves SMC and DP in distributed settings, focusing on linear regression over horizontally distributed data. That is, parties do not see each other's data and, further, cannot infer information about individuals from the final statistical model. Any statistical model that allows independent calculation of local statistics can be computed through our protocol. The protocol employs homomorphic encryption for SMC and the functional mechanism for DP to achieve the desired security and privacy guarantees. In
this work, we first introduce the theoretical foundation for the SM-DDP
protocol and then evaluate its efficacy and performance on two different
datasets. Our results show that one can achieve individual-level privacy
through the proposed protocol with distributed DP, which is independently
applied by each party in a distributed fashion. Moreover, our results also show
that the SM-DDP protocol incurs minimal computational overhead, is scalable,
and provides security and privacy guarantees.
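Since the full protocol is not spelled out in the abstract, the following minimal Python/NumPy sketch illustrates only the distributed-DP half of the idea: each party perturbs its local sufficient statistics for linear regression before they are aggregated. In the actual SM-DDP protocol the aggregation step would additionally be protected by homomorphic encryption; here it is shown in the clear for brevity, and all names and noise scales are illustrative assumptions.

import numpy as np

rng = np.random.default_rng(0)

def noisy_local_stats(X, y, epsilon, sensitivity):
    """One party's contribution: local X^T X and X^T y with Laplace noise.

    Illustrative only: a real deployment must calibrate `sensitivity` to the
    data domain and split the privacy budget across both statistics.
    """
    d = X.shape[1]
    scale = sensitivity / epsilon
    A = X.T @ X + rng.laplace(0.0, scale, size=(d, d))
    b = X.T @ y + rng.laplace(0.0, scale, size=d)
    return A, b

# Each party perturbs its own statistics independently (distributed DP)...
parties = [(rng.normal(size=(100, 3)), rng.normal(size=100)) for _ in range(4)]
stats = [noisy_local_stats(X, y, epsilon=1.0, sensitivity=1.0) for X, y in parties]

# ...and the aggregation (done under homomorphic encryption in SM-DDP,
# shown in the clear here) sums the statistics and solves for the coefficients.
A_total = sum(A for A, _ in stats)
b_total = sum(b for _, b in stats)
print(np.linalg.solve(A_total, b_total))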
PrivMin: Differentially Private MinHash for Jaccard Similarity Computation
In many industrial applications of big data, Jaccard similarity computation has been widely used to measure the distance between two profiles or sets owned respectively by two users. Yet a semi-honest user with unpredictable background knowledge may deduce private or sensitive information (e.g., the existence of a single element in the original sets) about the other user from the shared similarity. In this paper, we aim to solve the privacy issues in Jaccard similarity computation with strict differential privacy guarantees. To achieve this, we first define the Conditional $\epsilon$-DPSO, a
relaxed differential privacy definition regarding set operations, and prove
that the MinHash-based Jaccard Similarity Computation (MH-JSC) satisfies this
definition. Then for achieving strict differential privacy in MH-JSC, we
propose the PrivMin algorithm, which consists of two private operations: 1) Private MinHash Value Generation, which introduces Exponential noise into the generation of the MinHash signature, and 2) Randomized MinHashing Steps Selection, which adopts the Randomized Response technique to privately select several steps within the MinHashing phase that are deployed with the Exponential mechanism. Experiments on real datasets demonstrate that the proposed PrivMin algorithm retains the utility of the computed similarity while preserving privacy.
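As background for where such noise enters, here is a minimal MinHash-based Jaccard estimator in Python, with a comment marking the signature-generation step that PrivMin perturbs. The hash construction and parameter names are illustrative assumptions, not the paper's exact mechanism.

import random

def minhash_signature(s, num_hashes=128, universe=2**31 - 1, seed=7):
    """Standard MinHash signature of a set of hashable items."""
    rnd = random.Random(seed)
    hashes = [(rnd.randrange(1, universe), rnd.randrange(universe))
              for _ in range(num_hashes)]
    sig = []
    for a, b in hashes:
        # PrivMin's first operation perturbs exactly this minimum
        # (the paper uses the Exponential mechanism at this step).
        sig.append(min((a * hash(x) + b) % universe for x in s))
    return sig

def estimated_jaccard(sig1, sig2):
    """Fraction of agreeing signature positions estimates the Jaccard index."""
    return sum(m1 == m2 for m1, m2 in zip(sig1, sig2)) / len(sig1)

A = {"apple", "banana", "cherry", "date"}
B = {"banana", "cherry", "date", "elderberry"}
print(estimated_jaccard(minhash_signature(A), minhash_signature(B)))  # ~0.6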
Privacy Preserving Face Recognition Utilizing Differential Privacy
Facial recognition technologies are implemented in many areas, including, but not limited to, citizen surveillance, crime control, activity monitoring, and facial expression evaluation. However, processing biometric information is a resource-intensive task that often involves third-party servers, which can be accessed by adversaries with malicious intent. Biometric information delivered to untrusted third-party servers in an uncontrolled manner can be considered a significant privacy leak (i.e., uncontrolled information release), as biometrics can be correlated with sensitive data such as healthcare or financial records.
In this paper, we propose a privacy-preserving technique for "controlled
information release", where we disguise an original face image and prevent
leakage of the biometric features while identifying a person. We introduce a
new privacy-preserving face recognition protocol named PEEP (Privacy using
EigEnface Perturbation) that utilizes local differential privacy. PEEP applies differentially private perturbation to Eigenfaces and stores only the perturbed data on third-party servers, which run a standard Eigenface recognition algorithm. As a result, the trained model is not vulnerable to
privacy attacks such as membership inference and model memorization attacks.
Our experiments show that PEEP achieves a classification accuracy of around 70%-90% under standard privacy settings.
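A minimal sketch of the idea behind PEEP, assuming scikit-learn's PCA as the Eigenface step and plain Laplace noise on the projection coefficients; the paper's exact perturbation and budget accounting may differ, and the noise scale below is an illustrative assumption.

import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(42)
faces = rng.random((200, 64 * 64))  # stand-in for flattened face images

# Eigenface step: project faces onto the top principal components.
pca = PCA(n_components=32).fit(faces)
coeffs = pca.transform(faces)

# Local-DP-style step: perturb the Eigenface coefficients before they
# leave the client, so only noisy projections reach third-party servers.
epsilon = 1.0
sensitivity = np.abs(coeffs).max()  # crude bound, illustrative only
noisy_coeffs = coeffs + rng.laplace(0.0, sensitivity / epsilon, coeffs.shape)

# The server stores only `noisy_coeffs` and runs standard Eigenface
# recognition (e.g., nearest neighbour) on them.
query = noisy_coeffs[0]
nearest = np.argmin(np.linalg.norm(noisy_coeffs[1:] - query, axis=1)) + 1
print("nearest neighbour of face 0:", nearest)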
Privacy in Deep Learning: A Survey
The ever-growing advances of deep learning in many areas including vision,
recommendation systems, natural language processing, etc., have led to the
adoption of Deep Neural Networks (DNNs) in production systems. The availability
of large datasets and high computational power are the main contributors to
these advances. The datasets are usually crowdsourced and may contain sensitive
information. This poses serious privacy concerns as this data can be misused or
leaked through various vulnerabilities. Even if the cloud provider and the communication link are trusted, there are still threats of inference attacks, where an attacker could infer properties of the training data or recover the underlying model architecture and parameters. In this survey, we review the privacy concerns brought by deep learning and the mitigation techniques introduced to tackle these issues. We also show that there is a gap in the literature regarding test-time inference privacy and propose possible future research directions.
Inherit Differential Privacy in Distributed Setting: Multiparty Randomized Function Computation
How to achieve differential privacy in the distributed setting, where the dataset is distributed among mutually distrustful parties, is an important problem. We consider under what conditions a protocol can inherit the differential privacy property of the function it computes. The heart of the problem is the secure multiparty computation of randomized functions. A notion of obliviousness is introduced, which captures the key security problems that arise when computing a randomized function from a deterministic one in the distributed setting. From this observation, a necessary and sufficient condition for computing a randomized function from a deterministic one is given. This result can not only be used to determine whether a protocol computing a differentially private function is secure, but also to construct secure ones. We then prove that the differential privacy property of a function is inherited by a protocol that privately computes it. A composition theorem for differentially private protocols is also presented. We further construct protocols for generating random variates in the distributed setting, such as uniform random variate generation and the inversion method. Using these fundamental protocols, we construct protocols for the Gaussian mechanism, the Laplace mechanism, and the Exponential mechanism. Importantly, all these protocols satisfy obliviousness and so can be proved secure in a simulation-based manner. We also provide a complexity bound for computing randomized functions in the distributed setting. Finally, to show that our results are fundamental and powerful for multiparty differential privacy, we construct a differentially private empirical risk minimization protocol.
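The abstract's distributed noise-generation protocols are not given here, but one standard construction in this space (not necessarily the paper's) exploits the infinite divisibility of the Laplace distribution: if each of n parties independently adds the difference of two Gamma(1/n, b) variates, the summed noise is exactly Laplace(0, b), so no single party ever holds the full noise. A minimal Python sketch under that assumption:

import numpy as np

def local_noise_share(n_parties, scale, rng):
    """One party's additive share of Laplace(0, scale) noise.

    Uses the infinite divisibility of the Laplace distribution:
    Laplace(0, b) = sum over n parties of Gamma(1/n, b) - Gamma(1/n, b).
    """
    return rng.gamma(1.0 / n_parties, scale) - rng.gamma(1.0 / n_parties, scale)

rng = np.random.default_rng(1)
n, scale = 5, 1.0

# Sanity check: the aggregated shares are distributed as Laplace(0, scale).
samples = np.array([
    sum(local_noise_share(n, scale, rng) for _ in range(n))
    for _ in range(100_000)
])
print(samples.var(), 2 * scale**2)  # Laplace variance is 2*b^2; ~2.0 vs 2.0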
Privacy Preserving Record Linkage via grams Projections
Record linkage has been extensively used in data mining applications that involve sharing data. While the amount of available data is growing, the concern of disclosing sensitive information poses a utility-vs.-privacy problem. In this paper, we study the problem of private record linkage via secure data transformations. In contrast to existing techniques in this area, we propose a novel approach that provides strong privacy guarantees under the formal framework of differential privacy. We develop an embedding strategy based on frequent variable-length grams mined privately from the original data. We also introduce a personalized threshold for matching individual records in the embedded space, which achieves better linkage accuracy than the existing global-threshold approach. Compared with state-of-the-art secure matching schemes, our approach provides formal, provable privacy guarantees and achieves better scalability while offering comparable utility.
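To make the embedding idea concrete, here is a small Python sketch that maps strings to count vectors over a fixed gram vocabulary and matches records under per-record thresholds. The private mining of frequent grams and the calibrated DP noise are omitted; the vocabulary, distance, and thresholds below are illustrative assumptions rather than the paper's construction.

import numpy as np

def grams(s, q=2):
    """All q-grams of a string (the paper mines variable-length grams)."""
    return [s[i:i + q] for i in range(len(s) - q + 1)]

def embed(s, vocab):
    """Embed a record as a vector of gram counts over a shared vocabulary."""
    gs = grams(s)
    return np.array([gs.count(g) for g in vocab], dtype=float)

records_a = ["jonathan smith", "maria garcia"]
records_b = ["jonathon smith", "mario garcia"]

# In the paper the vocabulary is mined privately; here it is fixed by hand.
vocab = sorted({g for r in records_a + records_b for g in grams(r)})

# Personalized thresholds: one per record instead of a single global cutoff.
thresholds = {0: 2.0, 1: 2.0}

for i, ra in enumerate(records_a):
    va = embed(ra, vocab)
    for rb in records_b:
        dist = np.linalg.norm(va - embed(rb, vocab))
        if dist <= thresholds[i]:
            print(f"match: {ra!r} ~ {rb!r} (dist={dist:.2f})")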
Differential Privacy Techniques for Cyber Physical Systems: A Survey
Modern cyber-physical systems (CPSs) are widely used in our daily lives owing to the development of information and communication technologies (ICT). With the proliferation of CPSs, the security and privacy threats associated with these systems are also increasing. Passive attacks are used by intruders to gain access to private information in CPSs. To make CPS data more secure, privacy-preservation strategies such as encryption and k-anonymity have been presented in the past. However, with advances in CPS architecture, these techniques also need certain modifications. Meanwhile, differential privacy has emerged as an efficient technique for protecting CPS data privacy. In this paper, we present a comprehensive survey of differential privacy techniques for CPSs. In particular, we survey the application and implementation of differential privacy in four major application domains of CPSs: energy systems, transportation systems, healthcare and medical systems, and the industrial Internet of Things (IIoT). Furthermore, we present open issues, challenges, and future research directions for differential privacy techniques for CPSs. This survey can serve as a basis for the development of modern differential privacy techniques to address various problems and data privacy scenarios of CPSs.
Privacy-Preserving Multiparty Learning For Logistic Regression
In recent years, machine learning techniques have been widely used in numerous applications, such as weather forecasting, financial data analysis, spam filtering, and medical prediction. Meanwhile, massive data generated from multiple sources further improve the performance of machine learning tools. However, data sharing from multiple sources raises privacy issues for those sources, since sensitive information may be leaked in this process. In this paper, we propose a framework enabling multiple parties to collaboratively and accurately train a learning model over distributed datasets while guaranteeing the privacy of the data sources. Specifically, we consider the logistic regression model for data training and propose two approaches for perturbing the objective function to preserve $\epsilon$-differential privacy. The proposed solutions are tested on real datasets, including Bank Marketing and Credit Card Default prediction. Experimental results demonstrate that the proposed multiparty learning framework is highly efficient and accurate.
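The two perturbation approaches themselves are not detailed in the abstract, but objective perturbation for logistic regression is commonly realized via the functional mechanism (Zhang et al.): expand the loss to second order, add Laplace noise to the polynomial coefficients, and minimize the noisy objective. A minimal sketch of that generic idea, with an illustrative (not rigorously calibrated) noise scale:

import numpy as np

def fm_logistic_regression(X, y, epsilon, rng):
    """Functional-mechanism-style private logistic regression (sketch).

    Second-order Taylor expansion of the logistic loss around w = 0:
        sum_i [ log 2 + (0.5 - y_i) x_i.w + (1/8)(x_i.w)^2 ],
    then Laplace noise is added to the polynomial coefficients and the
    noisy quadratic objective is minimized in closed form.
    """
    n, d = X.shape
    c1 = (0.5 - y) @ X                  # linear-term coefficients
    C2 = (X.T @ X) / 8.0                # quadratic-term coefficients

    # Illustrative noise scale; the real mechanism calibrates it to the
    # global sensitivity of the coefficients (requires bounded features).
    scale = (d + d * d) / (4.0 * epsilon)
    c1 = c1 + rng.laplace(0.0, scale, size=d)
    N = rng.laplace(0.0, scale, size=(d, d))
    C2 = C2 + (N + N.T) / 2.0           # keep the quadratic form symmetric
    C2 = C2 + 1e-3 * n * np.eye(d)      # regularize so the problem stays convex

    # Minimize c1.w + w^T C2 w  =>  gradient c1 + 2 C2 w = 0.
    return np.linalg.solve(2.0 * C2, -c1)

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 4))
w_true = np.array([1.0, -2.0, 0.5, 0.0])
y = (X @ w_true + rng.logistic(size=500) > 0).astype(float)
print(fm_logistic_regression(X, y, epsilon=1.0, rng=rng))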
Computational Differential Privacy from Lattice-based Cryptography
The emerging technologies for large scale data analysis raise new challenges
to the security and privacy of sensitive user data. In this work we investigate
the problem of private statistical analysis of time-series data in the
distributed and semi-honest setting. In particular, we study some properties of
Private Stream Aggregation (PSA), first introduced by Shi et al. (2011). This is
a computationally secure protocol for the collection and aggregation of data in
a distributed network and has a very small communication cost. In the
non-adaptive query model, a secure PSA scheme can be built upon any
key-homomorphic weak pseudo-random function, as shown by Valovich (2017), yielding security guarantees in the standard model, in contrast to Shi et al. We show that every mechanism which preserves $(\epsilon, \delta)$-differential privacy in effect preserves computational $(\epsilon, \delta)$-differential privacy when it is executed through a secure PSA scheme. Furthermore, we
introduce a novel perturbation mechanism based on the symmetric Skellam
distribution that is suited for preserving differential privacy in the
distributed setting, and find that its performance in terms of privacy and accuracy is comparable to that of previous solutions. On the other hand, we
leverage its specific properties to construct a computationally efficient
prospective post-quantum protocol for differentially private time-series data
analysis in the distributed model. The security of this protocol is based on
the hardness of a new variant of the Decisional Learning with Errors (DLWE)
problem. In this variant the errors are taken from the symmetric Skellam
distribution. We show that this new variant is hard based on the hardness of
the standard Learning with Errors (LWE) problem where the errors are taken from
the discrete Gaussian distribution. Thus, we provide a variant of the LWE
problem that is hard...
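The Skellam mechanism's appeal in the distributed model comes from the distribution's closure under convolution: a Skellam variate is the difference of two Poisson variates, so each participant can contribute a small share of the total noise. A minimal Python sketch of that property (noise scale illustrative, not calibrated to the paper's privacy analysis):

import numpy as np

def skellam_noise(mu, rng):
    """Symmetric Skellam(mu, mu) noise: difference of two Poisson(mu) variates."""
    return rng.poisson(mu) - rng.poisson(mu)

rng = np.random.default_rng(3)
n_parties, mu_total = 10, 20.0

# Each party adds an independent Skellam(mu_total / n, mu_total / n) share...
values = rng.integers(0, 50, size=n_parties)
noisy_sum = sum(int(v) + int(skellam_noise(mu_total / n_parties, rng))
                for v in values)

# ...and because Skellam is closed under convolution, the aggregate noise
# is exactly Skellam(mu_total, mu_total), with variance 2 * mu_total.
print(noisy_sum, values.sum())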
Private Stream Aggregation Revisited
In this work, we investigate the problem of private statistical analysis in
the distributed and semi-honest setting. In particular, we study properties of
Private Stream Aggregation schemes, first introduced by Shi et al. [2].
These are computationally secure protocols for the aggregation of data in a
network and have a very small communication cost. We show that such schemes can be built upon any key-homomorphic weak pseudo-random function. Thus, in contrast to the aforementioned work, our security definition can be achieved in the standard model. In addition, we give a computationally efficient instantiation of this protocol based on the Decisional Diffie-Hellman problem. Moreover, we show that every mechanism which preserves $(\epsilon, \delta)$-differential privacy provides computational $(\epsilon, \delta)$-differential privacy when it is executed through a Private Stream Aggregation scheme. Finally, we introduce a novel perturbation mechanism based on the Skellam distribution that is suited for the distributed setting, and compare its performance with that of previous solutions.
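To illustrate the shape of such a DDH-style aggregation protocol, here is a toy Python sketch in the spirit of Shi et al.'s construction: user keys sum to zero, each user blinds g^{x_i} with H(t)^{s_i}, and the aggregator's key cancels the blinding, leaving g^{sum x_i}, from which the small sum is recovered by brute-force discrete log. The group size and hash are toy choices; this is not a secure parameterization or the paper's exact scheme.

import hashlib
import random

# Toy multiplicative group Z_p^* (insecure parameters, illustration only).
p = 1_000_003          # a small prime; real schemes use ~2048-bit groups
g = 2

def H(t):
    """Hash a time step into the group (toy random oracle)."""
    return int.from_bytes(hashlib.sha256(str(t).encode()).digest(), "big") % p

n = 5
rng = random.Random(0)

# Key setup: user keys s_1..s_n plus an aggregator key s_0, summing to 0
# modulo the group order, so the blinding factors cancel on aggregation.
user_keys = [rng.randrange(p - 1) for _ in range(n)]
agg_key = (-sum(user_keys)) % (p - 1)

t = 42                                # time step / query identifier
data = [3, 1, 4, 1, 5]                # small per-user inputs

# Each user sends g^{x_i} * H(t)^{s_i}: individual values stay blinded.
ciphertexts = [pow(g, x, p) * pow(H(t), s, p) % p
               for x, s in zip(data, user_keys)]

# Aggregator multiplies everything, applies its key, and obtains g^{sum x_i}.
V = pow(H(t), agg_key, p)
for c in ciphertexts:
    V = V * c % p

# Recover the (small) sum by brute-force discrete log.
total = next(x for x in range(10_000) if pow(g, x, p) == V)
print(total, sum(data))               # both print 14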