Distributed Private Heavy Hitters
In this paper, we give efficient algorithms and lower bounds for solving the
heavy hitters problem while preserving differential privacy in the fully
distributed local model. In this model, there are n parties, each of which
possesses a single element from a universe of size N. The heavy hitters problem
is to find the identity of the most common element shared amongst the n
parties. In the local model, there is no trusted database administrator, and so
the algorithm must interact with each of the parties separately, using a
differentially private protocol. We give tight information-theoretic upper and
lower bounds on the accuracy to which this problem can be solved in the local
model (giving a separation between the local model and the more common
centralized model of privacy), as well as computationally efficient algorithms
even in the case where the data universe N may be exponentially large.
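To make the local model concrete, here is a minimal sketch in which each party randomizes its own element before sending it, and the aggregator debiases the noisy histogram to guess the most common element. It uses generalized (k-ary) randomized response, a standard textbook mechanism swapped in for illustration, not the algorithm from the paper. It enumerates the whole universe, so it is only practical for small N, whereas the paper targets exponentially large universes. The function names, the `seed` parameter, and the example universe are all assumptions.

```python
import math
import random
from collections import Counter


def randomize(value, universe, epsilon, rng):
    """Generalized randomized response: keep the true value with probability
    e^eps / (e^eps + N - 1), otherwise report a uniformly random other element."""
    n_items = len(universe)
    p_true = math.exp(epsilon) / (math.exp(epsilon) + n_items - 1)
    if rng.random() < p_true:
        return value
    others = [u for u in universe if u != value]
    return rng.choice(others)


def estimate_counts(reports, universe, epsilon):
    """Debias the noisy histogram so each estimate is unbiased for the true count."""
    n = len(reports)
    n_items = len(universe)
    p = math.exp(epsilon) / (math.exp(epsilon) + n_items - 1)
    q = (1 - p) / (n_items - 1)  # chance of reporting one specific other element
    noisy = Counter(reports)
    # E[noisy[u]] = count_u * (p - q) + n * q, so invert that affine map.
    return {u: (noisy[u] - n * q) / (p - q) for u in universe}


def heavy_hitter(values, universe, epsilon, seed=0):
    """Each party randomizes locally; the aggregator only sees the reports."""
    rng = random.Random(seed)  # seeded only to make the sketch reproducible
    reports = [randomize(v, universe, epsilon, rng) for v in values]
    estimates = estimate_counts(reports, universe, epsilon)
    return max(estimates, key=estimates.get)


if __name__ == "__main__":
    universe = list("abcdefgh")
    values = ["c"] * 6000 + [universe[i % 8] for i in range(14000)]
    print(heavy_hitter(values, universe, epsilon=1.0))
```

Each report on its own satisfies epsilon-local differential privacy, because the ratio between the probabilities of any output under two different inputs is at most e^epsilon.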
POPSTAR: Lightweight Threshold Reporting with Reduced Leakage
This paper proposes POPSTAR, a new lightweight protocol for the private computation of heavy hitters, also known as a private threshold reporting system. In such a protocol, the users provide input measurements, and a report server learns which measurements appear more than a pre-specified threshold. POPSTAR follows the same architecture as STAR (Davidson et al., CCS 2022) by relying on a helper randomness server in addition to a main server computing the aggregate heavy hitter statistics. While STAR is extremely lightweight, it leaks a substantial amount of information, consisting of an entire histogram of the provided measurements (but only reveals the actual measurements that appear beyond the threshold). POPSTAR shows that this leakage can be reduced at a modest cost (7× longer aggregation time). Our leakage is closer to that of Poplar (Boneh et al., S&P 2021), which, however, relies on distributed point functions and on a different model that requires interaction between two non-colluding servers (with equal workloads) to compute the heavy hitters.
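The functionality described above can be sketched without any cryptography. The first function below is the ideal output the report server should learn, and the second is the extra leakage the abstract attributes to STAR: the histogram of counts with the measurements themselves hidden. Both function names are hypothetical, and treating the threshold as inclusive is an assumption. Neither function reflects how STAR or POPSTAR computes these values.

```python
from collections import Counter


def threshold_report(measurements, threshold):
    """Ideal output: measurements seen at least `threshold` times, with counts.
    (Inclusive comparison is an assumption of this sketch.)"""
    counts = Counter(measurements)
    return {m: c for m, c in counts.items() if c >= threshold}


def anonymous_histogram(measurements):
    """Leakage attributed to STAR: how often each distinct measurement occurs,
    without revealing which measurement each count belongs to."""
    return sorted(Counter(measurements).values(), reverse=True)


if __name__ == "__main__":
    data = ["a"] * 3 + ["b"] * 2 + ["c"]
    print(threshold_report(data, 2))
    print(anonymous_histogram(data))
```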
Lightweight Techniques for Private Heavy Hitters
This paper presents a new protocol for solving the private heavy-hitters
problem. In this problem, there are many clients and a small set of
data-collection servers. Each client holds a private bitstring. The servers
want to recover the set of all popular strings, without learning anything else
about any client's string. A web-browser vendor, for instance, can use our
protocol to figure out which homepages are popular, without learning any user's
homepage. We also consider the simpler private subset-histogram problem, in
which the servers want to count how many clients hold strings in a particular
set without revealing this set to the clients.
Our protocols use two data-collection servers and, in a protocol run, each
client sends only a single message to the servers. Our protocols protect
client privacy against arbitrary misbehavior by one of the servers and our
approach requires no public-key cryptography (except for secure channels), nor
general-purpose multiparty computation. Instead, we rely on incremental
distributed point functions, a new cryptographic tool that allows a client to
succinctly secret-share the labels on the nodes of an exponentially large
binary tree, provided that the tree has a single non-zero path. Along the way,
we develop new general tools for providing malicious security in applications
of distributed point functions.
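The tree-based search can be illustrated with a deliberately naive sketch. Each client additively secret-shares a one-hot vector for its prefix at every level of the binary tree, each server sums the shares it receives, and the two servers open counts only for children of prefixes that are still popular. This is not the authors' protocol. The share vectors here grow with the size of the tree, whereas incremental distributed point functions make them succinct, and there is no protection against malicious clients or servers. `MODULUS`, the function names, and the use of a count threshold are assumptions.

```python
import secrets

MODULUS = 2**61 - 1  # hypothetical modulus for additive shares


def client_shares(bits):
    """Split a per-level one-hot encoding of the client's prefixes into two
    additive shares; each share on its own is uniformly random."""
    share0, share1 = [], []
    for level in range(1, len(bits) + 1):
        index = int(bits[:level], 2)  # position of this prefix at this level
        row0 = [secrets.randbelow(MODULUS) for _ in range(2 ** level)]
        row1 = [(-r) % MODULUS for r in row0]
        row1[index] = (row1[index] + 1) % MODULUS  # shares sum to 1 here, 0 elsewhere
        share0.append(row0)
        share1.append(row1)
    return share0, share1


def aggregate(all_shares):
    """Server side: add up the share vectors received from every client."""
    total = [list(row) for row in all_shares[0]]
    for shares in all_shares[1:]:
        for lvl, row in enumerate(shares):
            for i, x in enumerate(row):
                total[lvl][i] = (total[lvl][i] + x) % MODULUS
    return total


def heavy_hitters(agg0, agg1, n_bits, threshold):
    """Walk the tree level by level, opening counts only for children of
    prefixes that met the threshold at the previous level."""
    survivors = [""]
    for level in range(1, n_bits + 1):
        next_survivors = []
        for prefix in survivors:
            for bit in "01":
                child = prefix + bit
                idx = int(child, 2)
                count = (agg0[level - 1][idx] + agg1[level - 1][idx]) % MODULUS
                if count >= threshold:
                    next_survivors.append(child)
        survivors = next_survivors
    return survivors


if __name__ == "__main__":
    clients = ["1010"] * 5 + ["1011"] * 3 + ["0001"]
    pairs = [client_shares(c) for c in clients]
    agg0 = aggregate([p[0] for p in pairs])
    agg1 = aggregate([p[1] for p in pairs])
    print(heavy_hitters(agg0, agg1, n_bits=4, threshold=3))
```

Pruning keeps the number of opened counts proportional to the number of popular prefixes rather than to the full tree, which is why a prefix search remains feasible for long strings.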
In an experimental evaluation with two servers on opposite sides of the U.S.,
the servers can find the 200 most popular strings among a set of 400,000
client-held 256-bit strings in 54 minutes. Our protocols are highly
parallelizable. We estimate that with 20 physical machines per logical server,
our protocols could compute heavy hitters over ten million clients in just over
one hour of computation.
Comment: To appear in IEEE Security & Privacy 202
Heavy Hitters and the Structure of Local Privacy
We present a new locally differentially private algorithm for the heavy
hitters problem which achieves optimal worst-case error as a function of all
standardly considered parameters. Prior work obtained error rates which depend
optimally on the number of users, the size of the domain, and the privacy
parameter, but depend sub-optimally on the failure probability.
We strengthen existing lower bounds on the error to incorporate the failure
probability, and show that our new upper bound is tight with respect to this
parameter as well. Our lower bound is based on a new understanding of the
structure of locally private protocols. We further develop these ideas to
obtain the following general results beyond heavy hitters.
Advanced Grouposition: In the local model, group privacy for $k$
users degrades proportionally to $\approx \sqrt{k}$, instead of linearly in $k$
as in the central model. Stronger group privacy yields improved max-information
guarantees, as well as stronger lower bounds (via "packing arguments"), over
the central model.
Building on a transformation of Bassily and Smith (STOC 2015), we
give a generic transformation from any non-interactive approximate-private
local protocol into a pure-private local protocol. Again in contrast with the
central model, this shows that we cannot obtain more accurate algorithms by
moving from pure to approximate local privacy.