1,833 research outputs found
Exploring Privacy Preservation in Outsourced K-Nearest Neighbors with Multiple Data Owners
The k-nearest neighbors (k-NN) algorithm is a popular and effective
classification algorithm. Due to its large storage and computational
requirements, it is suitable for cloud outsourcing. However, k-NN is often run
on sensitive data such as medical records, user images, or personal
information. It is important to protect the privacy of data in an outsourced
k-NN system.
Prior works have all assumed the data owners (who submit data to the
outsourced k-NN system) are a single trusted party. However, we observe that in
many practical scenarios, there may be multiple mutually distrusting data
owners. In this work, we present the first framing and exploration of privacy
preservation in an outsourced k-NN system with multiple data owners. We
consider the various threat models introduced by this modification. We discover
that under a particularly practical threat model that covers numerous
scenarios, there exists a set of adaptive attacks that breach the data privacy
of any exact k-NN system. The vulnerability is a result of the mathematical
properties of k-NN and its output. Thus, we propose a privacy-preserving
alternative system supporting kernel density estimation using a Gaussian
kernel, a classification algorithm from the same family as k-NN. In many
applications, this similar algorithm serves as a good substitute for k-NN. We
additionally investigate solutions for other threat models, often through
extensions on prior single data owner systems
Lower Bounds for Oblivious Near-Neighbor Search
We prove an lower bound on the dynamic
cell-probe complexity of statistically
approximate-near-neighbor search () over the -dimensional
Hamming cube. For the natural setting of , our result
implies an lower bound, which is a quadratic
improvement over the highest (non-oblivious) cell-probe lower bound for
. This is the first super-logarithmic
lower bound for against general (non black-box) data structures.
We also show that any oblivious data structure for
decomposable search problems (like ) can be obliviously dynamized
with overhead in update and query time, strengthening a classic
result of Bentley and Saxe (Algorithmica, 1980).Comment: 28 page
SANNS: Scaling Up Secure Approximate k-Nearest Neighbors Search
The -Nearest Neighbor Search (-NNS) is the backbone of several
cloud-based services such as recommender systems, face recognition, and
database search on text and images. In these services, the client sends the
query to the cloud server and receives the response in which case the query and
response are revealed to the service provider. Such data disclosures are
unacceptable in several scenarios due to the sensitivity of data and/or privacy
laws.
In this paper, we introduce SANNS, a system for secure -NNS that keeps
client's query and the search result confidential. SANNS comprises two
protocols: an optimized linear scan and a protocol based on a novel sublinear
time clustering-based algorithm. We prove the security of both protocols in the
standard semi-honest model. The protocols are built upon several
state-of-the-art cryptographic primitives such as lattice-based additively
homomorphic encryption, distributed oblivious RAM, and garbled circuits. We
provide several contributions to each of these primitives which are applicable
to other secure computation tasks. Both of our protocols rely on a new circuit
for the approximate top- selection from numbers that is built from comparators.
We have implemented our proposed system and performed extensive experimental
results on four datasets in two different computation environments,
demonstrating more than faster response time compared to
optimally implemented protocols from the prior work. Moreover, SANNS is the
first work that scales to the database of 10 million entries, pushing the limit
by more than two orders of magnitude.Comment: 18 pages, to appear at USENIX Security Symposium 202
Shortest Path Computation with No Information Leakage
Shortest path computation is one of the most common queries in location-based
services (LBSs). Although particularly useful, such queries raise serious
privacy concerns. Exposing to a (potentially untrusted) LBS the client's
position and her destination may reveal personal information, such as social
habits, health condition, shopping preferences, lifestyle choices, etc. The
only existing method for privacy-preserving shortest path computation follows
the obfuscation paradigm; it prevents the LBS from inferring the source and
destination of the query with a probability higher than a threshold. This
implies, however, that the LBS still deduces some information (albeit not
exact) about the client's location and her destination. In this paper we aim at
strong privacy, where the adversary learns nothing about the shortest path
query. We achieve this via established private information retrieval
techniques, which we treat as black-box building blocks. Experiments on real,
large-scale road networks assess the practicality of our schemes.Comment: VLDB201
Building Confidential and Efficient Query Services in the Cloud with RASP Data Perturbation
With the wide deployment of public cloud computing infrastructures, using
clouds to host data query services has become an appealing solution for the
advantages on scalability and cost-saving. However, some data might be
sensitive that the data owner does not want to move to the cloud unless the
data confidentiality and query privacy are guaranteed. On the other hand, a
secured query service should still provide efficient query processing and
significantly reduce the in-house workload to fully realize the benefits of
cloud computing. We propose the RASP data perturbation method to provide secure
and efficient range query and kNN query services for protected data in the
cloud. The RASP data perturbation method combines order preserving encryption,
dimensionality expansion, random noise injection, and random projection, to
provide strong resilience to attacks on the perturbed data and queries. It also
preserves multidimensional ranges, which allows existing indexing techniques to
be applied to speedup range query processing. The kNN-R algorithm is designed
to work with the RASP range query algorithm to process the kNN queries. We have
carefully analyzed the attacks on data and queries under a precisely defined
threat model and realistic security assumptions. Extensive experiments have
been conducted to show the advantages of this approach on efficiency and
security.Comment: 18 pages, to appear in IEEE TKDE, accepted in December 201
- …