253 research outputs found

    SANNS: Scaling Up Secure Approximate k-Nearest Neighbors Search

    The $k$-Nearest Neighbor Search ($k$-NNS) is the backbone of several cloud-based services such as recommender systems, face recognition, and database search on text and images. In these services, the client sends the query to the cloud server and receives the response, in which case both the query and the response are revealed to the service provider. Such data disclosures are unacceptable in several scenarios due to the sensitivity of data and/or privacy laws. In this paper, we introduce SANNS, a system for secure $k$-NNS that keeps the client's query and the search result confidential. SANNS comprises two protocols: an optimized linear scan and a protocol based on a novel sublinear-time clustering-based algorithm. We prove the security of both protocols in the standard semi-honest model. The protocols are built upon several state-of-the-art cryptographic primitives such as lattice-based additively homomorphic encryption, distributed oblivious RAM, and garbled circuits. We provide several contributions to each of these primitives which are applicable to other secure computation tasks. Both of our protocols rely on a new circuit for the approximate top-$k$ selection from $n$ numbers that is built from $O(n + k^2)$ comparators. We have implemented our proposed system and performed extensive experiments on four datasets in two different computation environments, demonstrating more than $18$-$31\times$ faster response time compared to optimally implemented protocols from the prior work. Moreover, SANNS is the first work that scales to a database of 10 million entries, pushing the limit by more than two orders of magnitude.
    Comment: 18 pages, to appear at USENIX Security Symposium 202

    Don't forget private retrieval: distributed private similarity search for large language models

    While the flexible capabilities of large language models (LLMs) allow them to answer a range of queries based on existing learned knowledge, information retrieval to augment generation is an important tool to allow LLMs to answer questions on information not included in pre-training data. Such private information is increasingly being generated in a wide array of distributed contexts by organizations and individuals. Performing such information retrieval using neural embeddings of queries and documents has always leaked information about queries and database content unless both were stored locally. We present Private Retrieval Augmented Generation (PRAG), an approach that uses multi-party computation (MPC) to securely transmit queries to a distributed set of servers containing a privately constructed database to return top-k and approximate top-k documents. This is a first-of-its-kind approach to dense information retrieval that ensures no server observes a client's query or can see the database content. The approach introduces a novel MPC-friendly protocol for inverted file approximate search (IVF) that allows for fast document search over distributed and private data in sublinear communication complexity. This work presents new avenues through which data for use in LLMs can be accessed and used without needing to centralize or forgo privacy.
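To make the inverted file (IVF) idea mentioned in the abstract above concrete, here is a minimal plaintext sketch. It is not PRAG's MPC protocol and provides no privacy; it only shows why probing a few clusters avoids scanning every document. The function names, the `nprobe` parameter, and the assumption that centroids are given in advance are illustrative choices, not taken from the paper.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def build_ivf(vectors, centroids):
    """Assign each vector index to the inverted list of its nearest centroid.

    Assumption: centroids were computed beforehand (for example by k-means).
    """
    lists = {c: [] for c in range(len(centroids))}
    for i, v in enumerate(vectors):
        nearest = min(range(len(centroids)),
                      key=lambda c: sq_dist(v, centroids[c]))
        lists[nearest].append(i)
    return lists


def ivf_search(query, vectors, centroids, lists, k, nprobe):
    """Approximate top-k: scan only the nprobe clusters closest to the query."""
    probe = sorted(range(len(centroids)),
                   key=lambda c: sq_dist(query, centroids[c]))[:nprobe]
    candidates = [i for c in probe for i in lists[c]]
    return sorted(candidates, key=lambda i: sq_dist(query, vectors[i]))[:k]


# Toy data: two well-separated clusters of 2-D "embeddings".
vectors = [(0, 0), (0, 1), (10, 10), (10, 11)]
centroids = [(0, 0.5), (10, 10.5)]
lists = build_ivf(vectors, centroids)
nearest_ids = ivf_search((9, 9), vectors, centroids, lists, k=2, nprobe=1)
```

With `nprobe=1` only one of the two inverted lists is scanned, so the result is approximate in general: a true neighbor that sits in an unprobed cluster would be missed.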

    Batched differentially private information retrieval

    Private Information Retrieval (PIR) allows several clients to query a database held by one or more servers, such that the contents of their queries remain private. Prior PIR schemes have achieved sublinear communication and computation by leveraging computational assumptions, federating trust among many servers, relaxing security to permit differentially private leakage, refactoring effort into an offline stage to reduce online costs, or amortizing costs over a large batch of queries. In this work, we present an efficient PIR protocol that combines all of the above techniques to achieve constant amortized communication and computation complexity in the size of the database and constant client work. We leverage differentially private leakage in order to provide better trade-offs between privacy and efficiency. Our protocol achieves speed-ups up to and exceeding 10x in practical settings compared to state-of-the-art PIR protocols, and can scale to batches with hundreds of millions of queries on cheap commodity AWS machines. Our protocol builds upon a new secret sharing scheme that is both incremental and non-malleable, which may be of interest to a wider audience. Our protocol provides security up to abort against malicious adversaries that can corrupt all but one party.
    Funding: 1414119 - National Science Foundation; CNS-1718135 - National Science Foundation; CNS-1931714 - National Science Foundation; HR00112020021 - Department of Defense/DARPA; 000000000000000000000000000000000000000000000000000000037211 - SRI International
    Published version: https://www.usenix.org/system/files/sec22-albab.pdf
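The privacy goal stated in the abstract above (servers answer a query without learning which record was requested) can be illustrated with the classic textbook two-server XOR scheme. This is not the batched, differentially private protocol the paper describes: it assumes two non-colluding, honest servers, models records as integers, and uses communication linear in the database size. All function names are hypothetical.

```python
import secrets


def make_queries(n, index):
    """Client: build two bit vectors that differ only at `index`.

    Each vector on its own is uniformly random, so a single server
    learns nothing about `index` (assuming the servers do not collude).
    """
    q1 = [secrets.randbelow(2) for _ in range(n)]
    q2 = list(q1)
    q2[index] ^= 1
    return q1, q2


def answer(database, query):
    """Server: XOR together the records selected by the query bits."""
    acc = 0
    for record, bit in zip(database, query):
        if bit:
            acc ^= record
    return acc


def reconstruct(a1, a2):
    """Client: all records except the wanted one cancel out under XOR."""
    return a1 ^ a2


database = [7, 42, 99, 13]          # toy records, replicated on both servers
q1, q2 = make_queries(len(database), 2)
record = reconstruct(answer(database, q1), answer(database, q2))
```

Each server touches about half of the database per query in this sketch, which is the linear cost that the techniques listed in the abstract (offline preprocessing, batching, relaxed leakage) aim to reduce.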

    Scaling Mobile Private Contact Discovery to Billions of Users

    Mobile contact discovery is a convenience feature of messengers such as WhatsApp or Telegram that helps users to identify which of their existing contacts are registered with the service. Unfortunately, the contact discovery implementation of many popular messengers massively violates the users' privacy as demonstrated by Hagen et al. (NDSS '21, ACM TOPS '23). Unbalanced private set intersection (PSI) protocols are a promising cryptographic solution to realize mobile private contact discovery; however, state-of-the-art protocols do not scale to real-world database sizes with billions of registered users in terms of communication and/or computation overhead. In our work, we make significant steps towards truly practical large-scale mobile private contact discovery. For this, we combine and substantially optimize the unbalanced PSI protocol of Kales et al. (USENIX Security '19) and the private information retrieval (PIR) protocol of Kogan and Corrigan-Gibbs (USENIX Security '21). Our resulting protocol has a total communication overhead that is sublinear in the size of the server's user database and also has sublinear online runtimes. We optimize our protocol by introducing database partitioning and efficient scheduling of user queries. To handle realistic change rates of databases and contact lists, we propose and evaluate different possibilities for efficient updates. We implement our protocol on smartphones and measure online runtimes of less than 2s to query up to 1024 contacts from a database with more than two billion entries. Furthermore, we achieve a reduction in setup communication of up to a factor of 32x compared to state-of-the-art mobile private contact discovery protocols.

    Piano: Extremely Simple, Single-Server PIR with Sublinear Server Computation

    We construct a sublinear-time single-server pre-processing Private Information Retrieval (PIR) scheme with optimal client storage and server computation (up to poly-logarithmic factors), only relying on the assumption of the existence of One Way Functions (OWF). Our scheme achieves amortized $\tilde{O}(\sqrt{n})$ online server computation and client computation and $O(\sqrt{n})$ online communication per query, and requires $\widetilde{O}_\lambda(\sqrt{n})$ client storage. Unlike prior single-server PIR schemes that rely on heavy cryptographic machinery such as Homomorphic Encryption, our scheme only utilizes lightweight cryptography such as PRFs, which is easily instantiated in practice. To our knowledge, this is the first practical implementation of a single-server sublinear-time PIR scheme. Compared to existing linear time single-server solutions, our schemes are faster by $10$-$300\times$ and are comparable to the fastest two-server schemes. In particular, for a 100GB database of 1.6 billion entries, our experiments show that our scheme has less than 40ms online computation time on a single core.
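As a rough reading of the asymptotics quoted above, taking $n$ to be the 1.6 billion entries mentioned and ignoring the polylogarithmic factors and constants hidden by the tilde notation (an assumption made here only for illustration):

```latex
\sqrt{n} = \sqrt{1.6 \times 10^{9}} = 4 \times 10^{4}
```

On this estimate the online work per query is on the order of tens of thousands of entries, compared with the 1.6 billion a linear scan would touch.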

    GPU-based Private Information Retrieval for On-Device Machine Learning Inference

    On-device machine learning (ML) inference can enable the use of private user data on user devices without revealing them to remote servers. However, a pure on-device solution to private ML inference is impractical for many applications that rely on embedding tables that are too large to be stored on-device. In particular, recommendation models typically use multiple embedding tables each on the order of 1-10 GBs of data, making them impractical to store on-device. To overcome this barrier, we propose the use of private information retrieval (PIR) to efficiently and privately retrieve embeddings from servers without sharing any private information. As off-the-shelf PIR algorithms are usually too computationally intensive to directly use for latency-sensitive inference tasks, we 1) propose novel GPU-based acceleration of PIR, and 2) co-design PIR with the downstream ML application to obtain further speedup. Our GPU acceleration strategy improves system throughput by more than $20\times$ over an optimized CPU PIR implementation, and our PIR-ML co-design provides an over $5\times$ additional throughput improvement at fixed model quality. Together, for various on-device ML applications such as recommendation and language modeling, our system on a single V100 GPU can serve up to 100,000 queries per second -- a $>100\times$ throughput improvement over a CPU-based baseline -- while maintaining model accuracy.