Local differential privacy has recently surfaced as a strong measure of
privacy in contexts where personal information remains private even from data
analysts. Working in a setting where both the data providers and data analysts
want to maximize the utility of statistical analyses performed on the released
data, we study the fundamental trade-off between local differential privacy and
utility. This trade-off is formulated as a constrained optimization problem:
maximize utility subject to local differential privacy constraints. We
introduce a combinatorial family of extremal privatization mechanisms, which we
call staircase mechanisms, and show that it contains the optimal privatization
mechanisms for a broad class of information theoretic utilities such as mutual
information and f-divergences. We further prove that for any utility function
and any privacy level, solving the privacy-utility maximization problem is
equivalent to solving a finite-dimensional linear program, the outcome of which
is the optimal staircase mechanism. However, solving this linear program can be
computationally expensive, since its number of variables is exponential in the
size of the alphabet from which the data are drawn. To address this,
we show that two simple privatization mechanisms, the binary and randomized
response mechanisms, are universally optimal in the low and high privacy
regimes, and closely approximate the optimum in the intermediate regime.

Comment: 52 pages, 10 figures in JMLR 201
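
As an illustration of the local differential privacy constraint and of the randomized response mechanism named in the abstract, the following is a minimal sketch, not the paper's own code. It assumes the standard k-ary randomized response definition (report the true symbol with probability e^eps / (k - 1 + e^eps), otherwise a uniformly chosen other symbol) and the standard eps-LDP condition Q(y|x) <= e^eps * Q(y|x') for all inputs x, x' and outputs y; neither definition is spelled out in the text above. The function names `randomized_response` and `satisfies_ldp` are hypothetical.

```python
import math


def randomized_response(k, eps):
    """Build the k x k matrix Q[x][y] = P(output y | input x).

    Assumed (standard) k-ary randomized response: keep the true symbol
    with probability e^eps / (k - 1 + e^eps), otherwise emit each of the
    other k - 1 symbols with probability 1 / (k - 1 + e^eps).
    """
    denom = k - 1 + math.exp(eps)
    p_same = math.exp(eps) / denom
    p_other = 1.0 / denom
    return [[p_same if x == y else p_other for y in range(k)]
            for x in range(k)]


def satisfies_ldp(Q, eps, tol=1e-12):
    """Check the eps-LDP constraint Q[x][y] <= e^eps * Q[x2][y] for all x, x2, y."""
    bound = math.exp(eps)
    n_in, n_out = len(Q), len(Q[0])
    return all(
        Q[x][y] <= bound * Q[x2][y] + tol
        for x in range(n_in)
        for x2 in range(n_in)
        for y in range(n_out)
    )


# Example: alphabet of size 4 at privacy level eps = 1.0 (illustrative values).
Q = randomized_response(4, 1.0)
print(satisfies_ldp(Q, 1.0))  # constraint is met with equality at eps = 1.0
print(satisfies_ldp(Q, 0.5))  # the same mechanism is not 0.5-LDP
```

Each row of `Q` is a probability distribution over outputs, and the privacy constraint is a set of linear inequalities in the entries of `Q`. The check above only verifies feasibility of one mechanism; it does not solve the utility maximization problem.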