743 research outputs found

    Exploring Privacy Preservation in Outsourced K-Nearest Neighbors with Multiple Data Owners

    Full text link
    The k-nearest neighbors (k-NN) algorithm is a popular and effective classification algorithm. Due to its large storage and computational requirements, it is suitable for cloud outsourcing. However, k-NN is often run on sensitive data such as medical records, user images, or personal information. It is important to protect the privacy of data in an outsourced k-NN system. Prior works have all assumed the data owners (who submit data to the outsourced k-NN system) are a single trusted party. However, we observe that in many practical scenarios, there may be multiple mutually distrusting data owners. In this work, we present the first framing and exploration of privacy preservation in an outsourced k-NN system with multiple data owners. We consider the various threat models introduced by this modification. We discover that under a particularly practical threat model that covers numerous scenarios, there exists a set of adaptive attacks that breach the data privacy of any exact k-NN system. The vulnerability is a result of the mathematical properties of k-NN and its output. Thus, we propose a privacy-preserving alternative system supporting kernel density estimation using a Gaussian kernel, a classification algorithm from the same family as k-NN. In many applications, this similar algorithm serves as a good substitute for k-NN. We additionally investigate solutions for other threat models, often through extensions on prior single data owner systems

    Secure k-NN as a Service Over Encrypted Data in Multi-User Setting

    Full text link
    To securely leverage the advantages of Cloud Computing, recently a lot of research has happened in the area of "Secure Query Processing over Encrypted Data". As a concrete use case, many encryption schemes have been proposed for securely processing k Nearest Neighbors (SkNN) over encrypted data in the outsourced setting. Recently Zhu et al[25]. proposed a SkNN solution which claimed to satisfy following four properties: (1)Data Privacy, (2)Key Confidentiality, (3)Query Privacy, and (4)Query Controllability. However, in this paper, we present an attack which breaks the Query Controllability claim of their scheme. Further, we propose a new SkNN solution which satisfies all the four existing properties along with an additional essential property of Query Check Verification. We analyze the security of our proposed scheme and present the detailed experimental results to showcase the efficiency in real world scenario

    k-Nearest Neighbor Classification over Semantically Secure Encrypted Relational Data

    Full text link
    Data Mining has wide applications in many areas such as banking, medicine, scientific research and among government agencies. Classification is one of the commonly used tasks in data mining applications. For the past decade, due to the rise of various privacy issues, many theoretical and practical solutions to the classification problem have been proposed under different security models. However, with the recent popularity of cloud computing, users now have the opportunity to outsource their data, in encrypted form, as well as the data mining tasks to the cloud. Since the data on the cloud is in encrypted form, existing privacy preserving classification techniques are not applicable. In this paper, we focus on solving the classification problem over encrypted data. In particular, we propose a secure k-NN classifier over encrypted data in the cloud. The proposed k-NN protocol protects the confidentiality of the data, user's input query, and data access patterns. To the best of our knowledge, our work is the first to develop a secure k-NN classifier over encrypted data under the semi-honest model. Also, we empirically analyze the efficiency of our solution through various experiments.Comment: 29 pages, 2 figures, 3 tables arXiv admin note: substantial text overlap with arXiv:1307.482

    Secure kk-ish Nearest Neighbors Classifier

    Get PDF
    In machine learning, classifiers are used to predict a class of a given query based on an existing (classified) database. Given a database S of n d-dimensional points and a d-dimensional query q, the k-nearest neighbors (kNN) classifier assigns q with the majority class of its k nearest neighbors in S. In the secure version of kNN, S and q are owned by two different parties that do not want to share their data. Unfortunately, all known solutions for secure kNN either require a large communication complexity between the parties, or are very inefficient to run. In this work we present a classifier based on kNN, that can be implemented efficiently with homomorphic encryption (HE). The efficiency of our classifier comes from a relaxation we make on kNN, where we allow it to consider kappa nearest neighbors for kappa ~ k with some probability. We therefore call our classifier k-ish Nearest Neighbors (k-ish NN). The success probability of our solution depends on the distribution of the distances from q to S and increase as its statistical distance to Gaussian decrease. To implement our classifier we introduce the concept of double-blinded coin-toss. In a doubly-blinded coin-toss the success probability as well as the output of the toss are encrypted. We use this coin-toss to efficiently approximate the average and variance of the distances from q to S. We believe these two techniques may be of independent interest. When implemented with HE, the k-ish NN has a circuit depth that is independent of n, therefore making it scalable. We also implemented our classifier in an open source library based on HELib and tested it on a breast tumor database. The accuracy of our classifier (F_1 score) were 98\% and classification took less than 3 hours compared to (estimated) weeks in current HE implementations

    Practical and fully secure multi keyword ranked search over encrypted data with lightweight client

    Get PDF
    Cloud computing offers computing services such as data storage and computing power and relieves its users of the burden of their direct management. While being extremely convenient, therefore immensely popular, cloud computing instigates concerns of privacy of outsourced data, for which conventional encryption is hardly a solution as the data is meant to be accessed, used and processed in an efficient manner. Multi keyword ranked search over encrypted data (MRSE) is a special form of secure searchable encryption (SSE), which lets users to privately find out the most similar documents to a given query using document representation methods such as tf-idf vectors and metrics such as cosine similarity. In this work, we propose a secure MRSE scheme that makes use of both a new secure k-NN algorithm and somewhat homomorphic encryption (SWHE). The scheme provides data, query and search pattern privacy and is amenable to access pattern privacy. We provide a formal security analysis of the secure k-NN algorithm and rely on IND-CPA security of the SWHE scheme to meet the strong privacy claims. The scheme provides speedup of about two orders of magnitude over the privacy-preserving MRSE schemes using only SWHE while its overall performance is comparable to other schemes in the literature with weaker forms of privacy claims. We present implementations results including one from the literature pertaining to response times, storage and bandwidth requirements and show that the scheme facilitates a lightweight client implementation
    corecore