    SANNS: Scaling Up Secure Approximate k-Nearest Neighbors Search

    The k-Nearest Neighbor Search (k-NNS) is the backbone of several cloud-based services such as recommender systems, face recognition, and database search on text and images. In these services, the client sends the query to the cloud server and receives the response, so both the query and the response are revealed to the service provider. Such data disclosures are unacceptable in several scenarios due to the sensitivity of data and/or privacy laws. In this paper, we introduce SANNS, a system for secure k-NNS that keeps the client's query and the search result confidential. SANNS comprises two protocols: an optimized linear scan and a protocol based on a novel sublinear time clustering-based algorithm. We prove the security of both protocols in the standard semi-honest model. The protocols are built upon several state-of-the-art cryptographic primitives such as lattice-based additively homomorphic encryption, distributed oblivious RAM, and garbled circuits. We provide several contributions to each of these primitives which are applicable to other secure computation tasks. Both of our protocols rely on a new circuit for the approximate top-k selection from n numbers that is built from O(n + k^2) comparators. We have implemented our proposed system and performed extensive experiments on four datasets in two different computation environments, demonstrating 18-31× faster response time compared to optimally implemented protocols from the prior work. Moreover, SANNS is the first work that scales to a database of 10 million entries, pushing the limit by more than two orders of magnitude. Comment: 18 pages, to appear at USENIX Security Symposium 202

    Privacy-Preserving and Outsourced Multi-User k-Means Clustering

    Many techniques for privacy-preserving data mining (PPDM) have been investigated over the past decade. Often, the entities involved in the data mining process are end-users or organizations with limited computing and storage resources. As a result, such entities may want to refrain from participating in the PPDM process. To overcome this issue and to take advantage of the many other benefits of cloud computing, outsourcing PPDM tasks to the cloud environment has recently gained special attention. We consider the scenario where n entities outsource their databases (in encrypted format) to the cloud and ask the cloud to perform the clustering task on their combined data in a privacy-preserving manner. We term such a process privacy-preserving and outsourced distributed clustering (PPODC). In this paper, we propose a novel and efficient solution to the PPODC problem based on the k-means clustering algorithm. The main novelty of our solution lies in avoiding the secure division operations required in computing cluster centers altogether through an efficient transformation technique. Our solution builds the clusters securely in an iterative fashion and returns the final cluster centers to all entities when a pre-determined termination condition holds. The proposed solution protects data confidentiality of all the participating entities under the standard semi-honest model. To the best of our knowledge, ours is the first work to discuss and propose a comprehensive solution to the PPODC problem that incurs negligible cost on the participating entities. We theoretically estimate both the computation and communication costs of the proposed protocol and also demonstrate its practical value through experiments on a real dataset. Comment: 16 pages, 2 figures, 5 table
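
The abstract above mentions avoiding division when working with cluster centers. One way this can be done in principle (a plaintext sketch only, not necessarily the transformation the paper uses) is to keep each center as a (sum vector, count) pair and compare scaled squared distances by cross-multiplying. The function name and the data layout below are hypothetical, and no encryption is involved.

```python
def nearest_center_no_division(x, centers):
    """Return the index of the closest center without dividing.

    Assumption: each center is stored as (sum_vector, count) with count > 0,
    so the actual center is sum_vector / count. This layout is illustrative.
    """
    def scaled_sq_dist(point, s, n):
        # n^2 * ||point - s/n||^2 == ||n*point - s||^2, integers only
        return sum((n * p - si) ** 2 for p, si in zip(point, s))

    best = 0
    for j in range(1, len(centers)):
        s_b, n_b = centers[best]
        s_j, n_j = centers[j]
        # d_j < d_best  <=>  n_b^2 * D_j < n_j^2 * D_best (cross-multiplication)
        if n_b ** 2 * scaled_sq_dist(x, s_j, n_j) < n_j ** 2 * scaled_sq_dist(x, s_b, n_b):
            best = j
    return best


# Two clusters whose true centers are (1, 1) and (10, 10).
centers = [((2, 2), 2), ((30, 30), 3)]
print(nearest_center_no_division((9, 8), centers))
print(nearest_center_no_division((0, 0), centers))
```

Only additions and multiplications appear in the comparison, which is the kind of arithmetic additively homomorphic and secret-sharing schemes tend to handle more cheaply than division.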

    Microdata protection through approximate microaggregation

    Microdata protection is a hot topic in the field of Statistical Disclosure Control, which has gained special interest after the disclosure of 658000 queries by the America Online (AOL) search engine in August 2006. Many algorithms, methods and properties have been proposed to deal with microdata disclosure. One of the emerging concepts in microdata protection is k-anonymity, introduced by Samarati and Sweeney. k-anonymity provides a simple and efficient approach to protect private individual information and is gaining increasing popularity. k-anonymity requires that every record in the released microdata table be indistinguishably related to no fewer than k respondents. In this paper, we apply the concept of entropy to propose a distance metric to evaluate the amount of mutual information among records in microdata, and propose a method of constructing a dependency tree to find the key attributes, which we then use to perform approximate microaggregation. Further, we adopt this new microaggregation technique to study the k-anonymity problem, and an efficient algorithm is developed. Experimental results show that the proposed microaggregation technique is efficient and effective in terms of running time and information loss.
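
The k-anonymity requirement stated in the abstract above can be illustrated with a small check: group records by their quasi-identifier values and confirm that every group has at least k members. This is a minimal sketch of the definition only, not the paper's microaggregation algorithm; the function name, field names, and sample records are made up for illustration.

```python
from collections import Counter


def is_k_anonymous(records, quasi_identifiers, k):
    """Return True if every quasi-identifier combination occurs at least k times."""
    groups = Counter(tuple(r[q] for q in quasi_identifiers) for r in records)
    return all(count >= k for count in groups.values())


# Hypothetical, already-generalized microdata table.
records = [
    {"age": "30-39", "zip": "481**", "diagnosis": "flu"},
    {"age": "30-39", "zip": "481**", "diagnosis": "asthma"},
    {"age": "40-49", "zip": "482**", "diagnosis": "flu"},
    {"age": "40-49", "zip": "482**", "diagnosis": "diabetes"},
]

print(is_k_anonymous(records, ["age", "zip"], 2))
print(is_k_anonymous(records, ["age", "zip"], 3))
```

Each (age, zip) combination here is shared by two records, so the table passes for k = 2 and fails for k = 3.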

    Scalable secure multi-party network vulnerability analysis via symbolic optimization

    Threat propagation analysis is a valuable tool in improving the cyber resilience of enterprise networks. As these networks are interconnected and threats can propagate not only within but also across networks, a holistic view of the entire network can reveal threat propagation trajectories unobservable from within a single enterprise. However, companies are reluctant to share internal vulnerability measurement data as it is highly sensitive and (if leaked) possibly damaging. Secure Multi-Party Computation (MPC) addresses this concern. MPC is a cryptographic technique that allows distrusting parties to compute analytics over their joint data while protecting its confidentiality. In this work we apply MPC to threat propagation analysis on large, federated networks. To address the prohibitively high performance cost of general-purpose MPC we develop two novel applications of optimizations that can be leveraged to execute many relevant graph algorithms under MPC more efficiently: (1) dividing the computation into separate stages such that the first stage is executed privately by each party without MPC and the second stage is an MPC computation dealing with a much smaller shared network, and (2) optimizing the second stage by treating the execution of the analysis algorithm as a symbolic expression that can be optimized to reduce the number of costly operations and subsequently executed under MPC. We evaluate the scalability of this technique by analyzing the potential for threat propagation on examples of network graphs and propose several directions along which this work can be expanded.

    Learning Character Strings via Mastermind Queries, with a Case Study Involving mtDNA

    We study the degree to which a character string, Q, leaks details about itself any time it engages in comparison protocols with strings provided by a querier, Bob, even if those protocols are cryptographically guaranteed to produce no additional information other than the scores that assess the degree to which Q matches strings offered by Bob. We show that such scenarios allow Bob to play variants of the game of Mastermind with Q so as to learn the complete identity of Q. We show that there are a number of efficient implementations for Bob to employ in these Mastermind attacks, depending on knowledge he has about the structure of Q, which show how quickly he can determine Q. Indeed, we show that Bob can discover Q using a number of rounds of test comparisons that is much smaller than the length of Q, under reasonable assumptions regarding the types of scores that are returned by the cryptographic protocols and whether he can use knowledge about the distribution that Q comes from. We also provide the results of a case study we performed on a database of mitochondrial DNA, showing the vulnerability of existing real-world DNA data to the Mastermind attack. Comment: Full version of related paper appearing in IEEE Symposium on Security and Privacy 2009, "The Mastermind Attack on Genomic Data." This version corrects the proofs of what are now Theorems 2 and 4.
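
The leakage described in the abstract above can be made concrete with a naive baseline: if the only thing Bob ever sees is the number of matching positions, he can still recover Q one position at a time. This sketch is deliberately simple and uses on the order of (length × alphabet size) queries, so it is not one of the paper's efficient attacks, which are stated to need far fewer rounds. The oracle, function names, and the example string are hypothetical.

```python
def make_oracle(secret):
    """Build a scoring function that reveals only the count of matching positions."""
    calls = {"n": 0}

    def score(guess):
        calls["n"] += 1
        return sum(a == b for a, b in zip(secret, guess))

    return score, calls


def recover(score, length, alphabet):
    """Recover the hidden string using match-count scores alone (naive baseline)."""
    guess = [alphabet[0]] * length
    base = score("".join(guess))
    for i in range(length):
        for c in alphabet[1:]:
            trial = guess.copy()
            trial[i] = c
            s = score("".join(trial))
            if s > base:
                # Score went up: c is the correct letter at position i.
                guess, base = trial, s
                break
            if s < base:
                # Score went down: the current letter was already correct.
                break
    return "".join(guess)


score, calls = make_oracle("GATTACA")
print(recover(score, 7, "ACGT"), calls["n"])
```

A score that stays unchanged after a substitution means both the old and the new letter are wrong at that position, so the loop simply moves on to the next candidate.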