SANNS: Scaling Up Secure Approximate k-Nearest Neighbors Search
The k-Nearest Neighbor Search (k-NNS) is the backbone of several
cloud-based services such as recommender systems, face recognition, and
database search on text and images. In these services, the client sends the
query to the cloud server and receives the response in which case the query and
response are revealed to the service provider. Such data disclosures are
unacceptable in several scenarios due to the sensitivity of data and/or privacy
laws.
In this paper, we introduce SANNS, a system for secure k-NNS that keeps the
client's query and the search result confidential. SANNS comprises two
protocols: an optimized linear scan and a protocol based on a novel sublinear
time clustering-based algorithm. We prove the security of both protocols in the
standard semi-honest model. The protocols are built upon several
state-of-the-art cryptographic primitives such as lattice-based additively
homomorphic encryption, distributed oblivious RAM, and garbled circuits. We
provide several contributions to each of these primitives which are applicable
to other secure computation tasks. Both of our protocols rely on a new circuit
for the approximate top-k selection from numbers that is built from comparators.
We have implemented our proposed system and performed extensive experiments
on four datasets in two different computation environments,
demonstrating more than faster response time compared to
optimally implemented protocols from the prior work. Moreover, SANNS is the
first work that scales to a database of 10 million entries, pushing the limit
by more than two orders of magnitude.
Comment: 18 pages, to appear at USENIX Security Symposium 202
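The SANNS abstract mentions a top-k selection circuit built from comparators. As a rough illustration of why comparators suit secure computation, the sketch below selects the k smallest numbers using only compare-exchange steps whose positions never depend on the data. It is an exact, quadratic-size selection written for clarity, not the approximate circuit from the paper; the function names are hypothetical.

```python
def compare_exchange(values, i, j):
    """One comparator: leave the smaller value at index i, the larger at j."""
    lo, hi = min(values[i], values[j]), max(values[i], values[j])
    values[i], values[j] = lo, hi


def oblivious_top_k_smallest(numbers, k):
    """Return the k smallest numbers in ascending order.

    The sequence of compared positions depends only on len(numbers) and k,
    never on the values themselves, so the same fixed network of comparators
    could be evaluated inside a garbled circuit.
    """
    values = list(numbers)
    n = len(values)
    k = min(k, n)
    for p in range(k):
        # Sweep right to left: the smallest remaining value ends at index p.
        for j in range(n - 1, p, -1):
            compare_exchange(values, j - 1, j)
    return values[:k]


if __name__ == "__main__":
    print(oblivious_top_k_smallest([5, 1, 4, 2, 3], 2))
```

This sketch uses roughly n*k comparators; an approximate circuit such as the one the abstract describes would trade exactness for a smaller comparator count.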
Don't forget private retrieval: distributed private similarity search for large language models
While the flexible capabilities of large language models (LLMs) allow them to
answer a range of queries based on existing learned knowledge, information
retrieval to augment generation is an important tool to allow LLMs to answer
questions on information not included in pre-training data. Such private
information is increasingly being generated in a wide array of distributed
contexts by organizations and individuals. Performing such information
retrieval using neural embeddings of queries and documents has always leaked
information about queries and database content unless both were stored locally.
We present Private Retrieval Augmented Generation (PRAG), an approach that uses
multi-party computation (MPC) to securely transmit queries to a distributed set
of servers containing a privately constructed database to return top-k and
approximate top-k documents. This is a first-of-its-kind approach to dense
information retrieval that ensures no server observes a client's query or can
see the database content. The approach introduces a novel MPC-friendly protocol
for inverted file approximate search (IVF) that allows for fast document search
over distributed and private data in sublinear communication complexity. This
work presents new avenues through which data for use in LLMs can be accessed
and used without needing to centralize or forgo privacy.
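For readers unfamiliar with inverted file (IVF) search, the following is a minimal plaintext sketch of the idea: vectors are bucketed by their nearest centroid, and a query only scans the buckets of its closest centroids. It contains no MPC and is not the PRAG protocol; the centroids are assumed to be given, and the names `build_ivf`, `ivf_search`, and `nprobe` are illustrative.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def build_ivf(vectors, centroids):
    """Assign each vector index to the inverted list of its nearest centroid."""
    lists = {c: [] for c in range(len(centroids))}
    for idx, v in enumerate(vectors):
        nearest = min(range(len(centroids)),
                      key=lambda c: sq_dist(v, centroids[c]))
        lists[nearest].append(idx)
    return lists


def ivf_search(query, vectors, centroids, lists, k, nprobe):
    """Approximate top-k: scan only the nprobe lists closest to the query."""
    probe = sorted(range(len(centroids)),
                   key=lambda c: sq_dist(query, centroids[c]))[:nprobe]
    candidates = [i for c in probe for i in lists[c]]
    candidates.sort(key=lambda i: sq_dist(query, vectors[i]))
    return candidates[:k]


if __name__ == "__main__":
    vectors = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
    centroids = [(0, 0), (10, 10)]  # assumed precomputed, e.g. by k-means
    lists = build_ivf(vectors, centroids)
    print(ivf_search((10, 10.6), vectors, centroids, lists, k=2, nprobe=1))
```

Because only `nprobe` lists are scanned, the work per query grows with the size of those lists rather than with the whole database, at the cost of possibly missing neighbors stored in unprobed lists.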
Batched differentially private information retrieval
Private Information Retrieval (PIR) allows several clients to query a database held by one or more servers, such that the contents of their queries remain private. Prior PIR schemes have achieved sublinear communication and computation by leveraging computational assumptions, federating trust among many servers, relaxing security to permit differentially private leakage, refactoring effort into an offline stage to reduce online costs, or amortizing costs over a large batch of queries.
In this work, we present an efficient PIR protocol that combines all of the above techniques to achieve constant amortized communication and computation complexity in the size of the database and constant client work. We leverage differentially private leakage in order to provide better trade-offs between privacy and efficiency. Our protocol achieves speed-ups up to and exceeding 10x in practical settings compared to state-of-the-art PIR protocols, and can scale to batches with hundreds of millions of queries on cheap commodity AWS machines. Our protocol builds upon a new secret sharing scheme that is both incremental and non-malleable, which may be of interest to a wider audience. Our protocol provides security up to abort against malicious adversaries that can corrupt all but one party.
1414119 - National Science Foundation; CNS-1718135 - National Science Foundation; CNS-1931714 - National Science Foundation; HR00112020021 - Department of Defense/DARPA; 000000000000000000000000000000000000000000000000000000037211 - SRI International
https://www.usenix.org/system/files/sec22-albab.pdf
Published version
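To make the PIR setting concrete, here is a sketch of the classic textbook two-server XOR scheme, in which each server sees only a uniformly random bit vector. It is not the batched, differentially private protocol from the abstract above, and it has linear communication and computation; it assumes the two servers do not collude, and all function names are illustrative.

```python
import secrets


def make_queries(n, index):
    """Split a request for record `index` into two random-looking bit vectors."""
    q1 = [secrets.randbelow(2) for _ in range(n)]
    q2 = list(q1)
    q2[index] ^= 1  # the two vectors differ only at the wanted position
    return q1, q2


def server_answer(database, query):
    """Each server XORs together the records selected by its bit vector."""
    answer = 0
    for record, bit in zip(database, query):
        if bit:
            answer ^= record
    return answer


def reconstruct(answer1, answer2):
    """All records except the wanted one cancel out under XOR."""
    return answer1 ^ answer2


if __name__ == "__main__":
    database = [11, 22, 33, 44]  # records encoded as integers
    q1, q2 = make_queries(len(database), 2)
    print(reconstruct(server_answer(database, q1), server_answer(database, q2)))
```

Each query vector on its own is uniformly distributed, so neither server learns the index. The schemes listed here aim to reduce the linear cost that this baseline pays on every query.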
Scaling Mobile Private Contact Discovery to Billions of Users
Mobile contact discovery is a convenience feature of messengers such as WhatsApp or Telegram that helps users to identify which of their existing contacts are registered with the service. Unfortunately, the contact discovery implementation of many popular messengers massively violates the users' privacy, as demonstrated by Hagen et al. (NDSS '21, ACM TOPS '23). Unbalanced private set intersection (PSI) protocols are a promising cryptographic solution to realize mobile private contact discovery; however, state-of-the-art protocols do not scale to real-world database sizes with billions of registered users in terms of communication and/or computation overhead.
In our work, we make significant steps towards truly practical large-scale mobile private contact discovery. For this, we combine and substantially optimize the unbalanced PSI protocol of Kales et al. (USENIX Security '19) and the private information retrieval (PIR) protocol of Kogan and Corrigan-Gibbs (USENIX Security '21). Our resulting protocol has a total communication overhead that is sublinear in the size of the server's user database and also has sublinear online runtimes. We optimize our protocol by introducing database partitioning and efficient scheduling of user queries. To handle realistic change rates of databases and contact lists, we propose and evaluate different possibilities for efficient updates. We implement our protocol on smartphones and measure online runtimes of less than 2s to query up to 1024 contacts from a database with more than two billion entries. Furthermore, we achieve a reduction in setup communication of up to a factor of 32x compared to state-of-the-art mobile private contact discovery protocols.
Piano: Extremely Simple, Single-Server PIR with Sublinear Server Computation
We construct a sublinear-time single-server pre-processing Private Information Retrieval (PIR) scheme with optimal client storage and server computation (up to poly-logarithmic factors), relying only on the assumption that one-way functions (OWF) exist. Our scheme achieves amortized online server computation and client computation and online communication per query, and requires client storage. Unlike prior single-server PIR schemes that rely on heavy cryptographic machinery such as homomorphic encryption, our scheme uses only lightweight cryptography such as PRFs, which are easily instantiated in practice. To our knowledge, this is the first practical implementation of a single-server sublinear-time PIR scheme.
Compared to existing linear time single-server solutions, our schemes are faster by and are comparable to the fastest two-server schemes. In particular, for a 100GB database of 1.6 billion entries, our experiments show that our scheme has less than 40ms online computation time on a single core
GPU-based Private Information Retrieval for On-Device Machine Learning Inference
On-device machine learning (ML) inference can enable the use of private user
data on user devices without revealing them to remote servers. However, a pure
on-device solution to private ML inference is impractical for many applications
that rely on embedding tables that are too large to be stored on-device. In
particular, recommendation models typically use multiple embedding tables each
on the order of 1-10 GBs of data, making them impractical to store on-device.
To overcome this barrier, we propose the use of private information retrieval
(PIR) to efficiently and privately retrieve embeddings from servers without
sharing any private information. As off-the-shelf PIR algorithms are usually
too computationally intensive to directly use for latency-sensitive inference
tasks, we 1) propose novel GPU-based acceleration of PIR, and 2) co-design PIR
with the downstream ML application to obtain further speedup. Our GPU
acceleration strategy improves system throughput by more than over
an optimized CPU PIR implementation, and our PIR-ML co-design provides an over
additional throughput improvement at fixed model quality. Together,
for various on-device ML applications such as recommendation and language
modeling, our system on a single V100 GPU can serve up to queries per
second -- a throughput improvement over a CPU-based baseline --
while maintaining model accuracy.