Exploring Privacy Preservation in Outsourced K-Nearest Neighbors with Multiple Data Owners
The k-nearest neighbors (k-NN) algorithm is a popular and effective
classification algorithm. Due to its large storage and computational
requirements, it is suitable for cloud outsourcing. However, k-NN is often run
on sensitive data such as medical records, user images, or personal
information. It is important to protect the privacy of data in an outsourced
k-NN system.
Prior works have all assumed the data owners (who submit data to the
outsourced k-NN system) are a single trusted party. However, we observe that in
many practical scenarios, there may be multiple mutually distrusting data
owners. In this work, we present the first framing and exploration of privacy
preservation in an outsourced k-NN system with multiple data owners. We
consider the various threat models introduced by this modification. We discover
that under a particularly practical threat model that covers numerous
scenarios, there exists a set of adaptive attacks that breach the data privacy
of any exact k-NN system. The vulnerability is a result of the mathematical
properties of k-NN and its output. Thus, we propose a privacy-preserving
alternative system supporting kernel density estimation using a Gaussian
kernel, a classification algorithm from the same family as k-NN. In many
applications, this similar algorithm serves as a good substitute for k-NN. We
additionally investigate solutions for other threat models, often through
extensions of prior single-data-owner systems.
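As a rough illustration of the substitute algorithm this abstract names, the following is a minimal plaintext sketch of classification by kernel density estimation with a Gaussian kernel. It includes none of the privacy-preserving machinery the abstract refers to, and the function name `gaussian_kde_classify`, the `bandwidth` parameter, and the unnormalized per-class scoring are assumptions made for illustration, not details taken from the paper.

```python
import math
from collections import defaultdict


def gaussian_kde_classify(points, labels, query, bandwidth=1.0):
    """Return the label whose summed Gaussian kernel weight at `query` is largest.

    Hypothetical helper: every training point contributes
    exp(-||p - query||^2 / (2 * bandwidth^2)) to the score of its own class,
    so all points vote, with nearer points counting more.
    """
    scores = defaultdict(float)
    for p, label in zip(points, labels):
        sq_dist = sum((a - b) ** 2 for a, b in zip(p, query))
        scores[label] += math.exp(-sq_dist / (2.0 * bandwidth ** 2))
    # The class with the highest (unnormalized) density estimate wins.
    return max(scores, key=scores.get)
```

Unlike exact k-NN, the output here depends smoothly on every point rather than on a hard cutoff at the k-th neighbor.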
Secure k-NN as a Service Over Encrypted Data in Multi-User Setting
To securely leverage the advantages of cloud computing, a lot of recent
research has addressed "Secure Query Processing over Encrypted
Data". As a concrete use case, many encryption schemes have been proposed for
securely processing k Nearest Neighbors (SkNN) queries over encrypted data in the
outsourced setting. Recently, Zhu et al. [25] proposed an SkNN solution that
claimed to satisfy the following four properties: (1) Data Privacy, (2) Key
Confidentiality, (3) Query Privacy, and (4) Query Controllability. However, in
this paper, we present an attack that breaks the Query Controllability claim
of their scheme. Further, we propose a new SkNN solution that satisfies all
four existing properties along with an additional essential property,
Query Check Verification. We analyze the security of our proposed scheme and
present detailed experimental results to showcase its efficiency in a
real-world scenario.
k-Nearest Neighbor Classification over Semantically Secure Encrypted Relational Data
Data Mining has wide applications in many areas such as banking, medicine,
scientific research and among government agencies. Classification is one of the
commonly used tasks in data mining applications. For the past decade, due to
the rise of various privacy issues, many theoretical and practical solutions to
the classification problem have been proposed under different security models.
However, with the recent popularity of cloud computing, users now have the
opportunity to outsource their data, in encrypted form, as well as the data
mining tasks to the cloud. Since the data on the cloud is in encrypted form,
existing privacy preserving classification techniques are not applicable. In
this paper, we focus on solving the classification problem over encrypted data.
In particular, we propose a secure k-NN classifier over encrypted data in the
cloud. The proposed k-NN protocol protects the confidentiality of the data,
user's input query, and data access patterns. To the best of our knowledge, our
work is the first to develop a secure k-NN classifier over encrypted data under
the semi-honest model. Also, we empirically analyze the efficiency of our
solution through various experiments.
Comment: 29 pages, 2 figures, 3 tables. arXiv admin note: substantial text
overlap with arXiv:1307.482
Secure k-ish Nearest Neighbors Classifier
In machine learning, classifiers are used to predict a class of a given query
based on an existing (classified) database. Given a database S of n
d-dimensional points and a d-dimensional query q, the k-nearest neighbors (kNN)
classifier assigns q with the majority class of its k nearest neighbors in S.
In the secure version of kNN, S and q are owned by two different parties that
do not want to share their data. Unfortunately, all known solutions for secure
kNN either require a large communication complexity between the parties, or are
very inefficient to run.
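For reference, the plain (non-secure) kNN rule defined above can be sketched in a few lines. This is a plaintext illustration only: the function name `knn_classify` and the choice of Euclidean distance are assumptions, since the abstract does not fix a metric.

```python
from collections import Counter


def knn_classify(S, labels, q, k):
    """Assign q the majority class among its k nearest neighbors in S.

    Hypothetical helper using squared Euclidean distance (an assumption).
    Ties in the vote go to the class encountered first, i.e. the one
    with the nearest member.
    """
    def sq_dist(p):
        return sum((a - b) ** 2 for a, b in zip(p, q))

    # Indices of the database points, ordered from nearest to farthest.
    order = sorted(range(len(S)), key=lambda i: sq_dist(S[i]))
    votes = Counter(labels[i] for i in order[:k])
    return votes.most_common(1)[0][0]
```

The sort over all n points is the step that is costly to perform under encryption, which is the motivation the abstract gives for relaxing the rule.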
In this work we present a classifier based on kNN, that can be implemented
efficiently with homomorphic encryption (HE). The efficiency of our classifier
comes from a relaxation we make on kNN, where we allow it to consider kappa
nearest neighbors for kappa ~ k with some probability. We therefore call our
classifier k-ish Nearest Neighbors (k-ish NN).
The success probability of our solution depends on the distribution of the
distances from q to S and increases as its statistical distance to Gaussian
decreases.
To implement our classifier we introduce the concept of double-blinded
coin-toss. In a double-blinded coin-toss the success probability as well as the
output of the toss are encrypted. We use this coin-toss to efficiently
approximate the average and variance of the distances from q to S. We believe
these two techniques may be of independent interest.
When implemented with HE, the k-ish NN has a circuit depth that is
independent of n, therefore making it scalable. We also implemented our
classifier in an open source library based on HELib and tested it on a breast
tumor database. The accuracy of our classifier (F_1 score) was 98% and
classification took less than 3 hours, compared to (estimated) weeks for current
HE implementations.
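The relaxation described in this abstract can be illustrated in plaintext. The sketch below is one plausible reading, not the paper's protocol: it assumes the distances from q to S are roughly Gaussian, estimates their mean and standard deviation, and takes as neighbors every point below the Gaussian k/n quantile, so that the number of neighbors kappa is close to k. The function name `k_ish_classify`, the quantile-based threshold, and the fallback rules are assumptions, and nothing here uses homomorphic encryption or the double-blinded coin-toss.

```python
import math
from collections import Counter
from statistics import NormalDist, fmean, pstdev


def k_ish_classify(S, labels, q, k):
    """Return (label, kappa): the majority class among the kappa ~ k points
    whose distance to q falls below a Gaussian-quantile threshold.

    Hypothetical plaintext sketch; the thresholding rule is an assumption.
    """
    n = len(S)
    if not 0 < k < n:
        raise ValueError("k must satisfy 0 < k < len(S)")
    dists = [math.dist(p, q) for p in S]
    mu, sigma = fmean(dists), pstdev(dists)
    if sigma == 0.0:
        # All points are equidistant, so every point is a neighbor.
        threshold = mu
    else:
        # Distance below which a fraction k/n of a Gaussian sample would fall.
        threshold = NormalDist(mu, sigma).inv_cdf(k / n)
    chosen = [labels[i] for i, d in enumerate(dists) if d <= threshold]
    if not chosen:
        # Fall back to the single nearest point if the threshold excludes all.
        chosen = [labels[dists.index(min(dists))]]
    return Counter(chosen).most_common(1)[0][0], len(chosen)
```

The threshold needs only two aggregate statistics rather than a sort over all n distances, which is consistent with the abstract's claim of a circuit depth independent of n. The further the distances are from Gaussian, the further kappa can drift from k.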
Practical and fully secure multi keyword ranked search over encrypted data with lightweight client
Cloud computing offers computing services such as data storage and computing power and relieves its users of the burden of managing them directly. While extremely convenient, and therefore immensely popular, cloud computing raises concerns about the privacy of outsourced data, for which conventional encryption is hardly a solution because the data is meant to be accessed, used and processed efficiently. Multi keyword ranked search over encrypted data (MRSE) is a special form of secure searchable encryption (SSE), which lets users privately find the documents most similar to a given query using document representation methods such as tf-idf vectors and metrics such as cosine similarity. In this work, we propose a secure MRSE scheme that makes use of both a new secure k-NN algorithm and somewhat homomorphic encryption (SWHE). The scheme provides data, query and search pattern privacy and is amenable to access pattern privacy. We provide a formal security analysis of the secure k-NN algorithm and rely on the IND-CPA security of the SWHE scheme to meet the strong privacy claims. The scheme provides a speedup of about two orders of magnitude over privacy-preserving MRSE schemes that use only SWHE, while its overall performance is comparable to that of other schemes in the literature with weaker privacy claims. We present implementation results, including one from the literature, pertaining to response times, storage and bandwidth requirements, and show that the scheme facilitates a lightweight client implementation.
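The ranking task that MRSE protects can be shown without any encryption. The following is a minimal plaintext sketch of top-k retrieval with tf-idf vectors and cosine similarity. The function names, the tokenized-document input format, and the smoothed idf formula `log(n / df) + 1` are assumptions chosen for illustration, not the scheme's actual construction.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Build sparse tf-idf vectors (dicts) for tokenized documents.

    Uses idf = log(n / df) + 1, one common variant (an assumption here).
    """
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    idf = {t: math.log(n / df[t]) + 1.0 for t in df}
    vecs = [{t: c * idf[t] for t, c in Counter(doc).items()} for doc in docs]
    return vecs, idf


def cosine(u, v):
    """Cosine similarity of two sparse vectors; 0.0 if either is all zeros."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    nu = math.sqrt(sum(w * w for w in u.values()))
    nv = math.sqrt(sum(w * w for w in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0


def ranked_search(docs, query_terms, k):
    """Return the indices of the k documents most similar to the query."""
    vecs, idf = tfidf_vectors(docs)
    # Query terms that never occur in the collection carry no weight.
    qvec = {t: c * idf[t] for t, c in Counter(query_terms).items() if t in idf}
    order = sorted(range(len(docs)), key=lambda i: cosine(qvec, vecs[i]), reverse=True)
    return order[:k]
```

In a secure scheme, the similarity scores and the resulting order would have to be computed without revealing the document vectors or the query to the server.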