The Optimal Mechanism in Differential Privacy
We derive the optimal $\epsilon$-differentially private mechanism for a single
real-valued query function under a very general utility-maximization (or
cost-minimization) framework. The class of noise probability distributions in
the optimal mechanism has {\em staircase-shaped} probability density functions
which are symmetric (around the origin), monotonically decreasing and
geometrically decaying. The staircase mechanism can be viewed as a {\em
geometric mixture of uniform probability distributions}, providing a simple
algorithmic description for the mechanism. Furthermore, the staircase mechanism
naturally generalizes to discrete query output settings as well as more
abstract settings. We explicitly derive the optimal noise probability
distributions with minimum expectation of noise amplitude and power. Comparing
the optimal performances with those of the Laplacian mechanism, we show that in
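the high privacy regime ($\epsilon$ is small), the Laplacian mechanism is
asymptotically optimal as $\epsilon \to 0$; in the low privacy regime
($\epsilon$ is large), the minimum expectation of noise amplitude and minimum
noise power are $\Theta(\Delta e^{-\epsilon/2})$ and
$\Theta(\Delta^2 e^{-2\epsilon/3})$ as $\epsilon \to +\infty$, while the
expectation of noise amplitude and power using the Laplacian mechanism are
$\frac{\Delta}{\epsilon}$ and $\frac{2\Delta^2}{\epsilon^2}$, where $\Delta$ is
the sensitivity of the query function. We conclude that the gains are more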
pronounced in the low privacy regime.
Comment: 40 pages, 5 figures. Part of this work was presented in DIMACS
Workshop on Recent Work on Differential Privacy across Computer Science,
October 24 - 26, 201
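The abstract's description of the staircase mechanism as a geometric mixture of uniform distributions suggests a simple sampler. The following is a minimal sketch, not the authors' reference implementation: the function name `sample_staircase_noise`, the step-width parameter `gamma`, and its default value $1/(1+e^{\epsilon/2})$ (the choice commonly cited as minimizing expected noise amplitude) are assumptions introduced here for illustration.

```python
import math
import random


def sample_staircase_noise(epsilon, sensitivity=1.0, gamma=None, rng=random):
    """Draw one sample of staircase-shaped noise (illustrative sketch).

    Assumes epsilon > 0. `gamma` in [0, 1] sets where each step drops;
    the default is an assumed amplitude-minimizing choice.
    """
    b = math.exp(-epsilon)
    if gamma is None:
        gamma = 1.0 / (1.0 + math.exp(epsilon / 2.0))
    # Symmetric around the origin: pick the sign with equal probability.
    sign = 1.0 if rng.random() < 0.5 else -1.0
    # Geometric step index G with P(G = i) = (1 - b) * b**i (inverse transform).
    step = int(math.floor(-math.log(1.0 - rng.random()) / epsilon))
    # Within step G, pick the tall part [G, G + gamma) or the short part
    # [G + gamma, G + 1), weighted by the probability mass of each part.
    p_tall = gamma / (gamma + (1.0 - gamma) * b)
    u = rng.random()
    if rng.random() < p_tall:
        offset = gamma * u
    else:
        offset = gamma + (1.0 - gamma) * u
    return sign * (step + offset) * sensitivity
```

Adding the returned value to a query answer with the given sensitivity yields the noisy output. For large `epsilon`, the average absolute value of the samples should fall well below the Laplacian value `sensitivity / epsilon`, in line with the low-privacy-regime comparison in the abstract.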
Demystifying Fixed k-Nearest Neighbor Information Estimators
Estimating mutual information from i.i.d. samples drawn from an unknown joint
density function is a basic statistical problem of broad interest with
multitudinous applications. The most popular estimator is one proposed by
Kraskov, Stögbauer, and Grassberger (KSG) in 2004, and is nonparametric and
based on the distances of each sample to its $k$-th nearest neighboring
sample, where $k$ is a fixed small integer. Despite its widespread use (part of
scientific software packages), theoretical properties of this estimator have
been largely unexplored. In this paper we demonstrate that the estimator is
consistent and also identify an upper bound on the rate of convergence of the
bias as a function of number of samples. We argue that the superior performance
benefits of the KSG estimator stem from a curious "correlation boosting"
effect and build on this intuition to modify the KSG estimator in novel ways to
construct a superior estimator. As a byproduct of our investigations, we obtain
nearly tight rates of convergence of the error of the well known fixed
$k$-nearest neighbor estimator of differential entropy by Kozachenko and
Leonenko.
Comment: 55 pages, 8 figures
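For readers unfamiliar with the fixed $k$-nearest neighbor entropy estimator mentioned above, here is a minimal one-dimensional sketch in its textbook form, $\hat{H} = \psi(N) - \psi(k) + \log 2 + \frac{1}{N}\sum_i \log \rho_i$, where $\rho_i$ is the distance from sample $i$ to its $k$-th nearest neighbor. The function names are hypothetical, and this is not the modified estimator proposed in the paper.

```python
import math
import random

EULER_GAMMA = 0.5772156649015329


def digamma_int(n):
    """Digamma at a positive integer: psi(n) = -gamma + sum_{j=1}^{n-1} 1/j."""
    return -EULER_GAMMA + sum(1.0 / j for j in range(1, n))


def kl_entropy_1d(samples, k=3):
    """Kozachenko-Leonenko k-NN estimate of differential entropy in nats (1-D).

    Assumes the samples are distinct (continuous data); ties would give a
    zero distance and an undefined logarithm.
    """
    xs = sorted(samples)
    n = len(xs)
    if n <= k:
        raise ValueError("need more than k samples")
    log_dist_sum = 0.0
    for i, x in enumerate(xs):
        # In sorted 1-D data the k nearest neighbours lie within k positions.
        window = xs[max(0, i - k):i] + xs[i + 1:i + k + 1]
        kth_dist = sorted(abs(x - y) for y in window)[k - 1]
        log_dist_sum += math.log(kth_dist)
    # log(2) is the log-volume of the 1-D unit ball (an interval of length 2).
    return digamma_int(n) - digamma_int(k) + math.log(2.0) + log_dist_sum / n
```

On a few thousand standard normal samples, the estimate should land near the true value $\frac{1}{2}\log(2\pi e) \approx 1.42$ nats; the choice of `k` trades bias against variance.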
Capacity of Fading Gaussian Channel with an Energy Harvesting Sensor Node
Network lifetime maximization is becoming an important design goal in
wireless sensor networks. Energy harvesting has recently become a preferred
choice for achieving this goal as it provides near perpetual operation. We
study such a sensor node with an energy harvesting source and compare various
architectures by which the harvested energy is used. We find its Shannon
capacity when it is transmitting its observations over a fading AWGN channel
with perfect/no channel state information provided at the transmitter. We
obtain an achievable rate when there are inefficiencies in energy storage and
the capacity when energy is spent in activities other than transmission.
Comment: 6 Pages, To be presented at IEEE GLOBECOM 201
MORSE: Semantic-ally Drive-n MORpheme SEgment-er
We present in this paper a novel framework for morpheme segmentation which
uses the morpho-syntactic regularities preserved by word representations, in
addition to orthographic features, to segment words into morphemes. This
framework is the first to consider vocabulary-wide syntactico-semantic
information for this task. We also analyze the deficiencies of available
benchmarking datasets and introduce our own dataset that was created on the
basis of compositionality. We validate our algorithm across datasets and
present state-of-the-art results.
Extremal Mechanisms for Local Differential Privacy
Local differential privacy has recently surfaced as a strong measure of
privacy in contexts where personal information remains private even from data
analysts. Working in a setting where both the data providers and data analysts
want to maximize the utility of statistical analyses performed on the released
data, we study the fundamental trade-off between local differential privacy and
utility. This trade-off is formulated as a constrained optimization problem:
maximize utility subject to local differential privacy constraints. We
introduce a combinatorial family of extremal privatization mechanisms, which we
call staircase mechanisms, and show that it contains the optimal privatization
mechanisms for a broad class of information theoretic utilities such as mutual
information and $f$-divergences. We further prove that for any utility function
and any privacy level, solving the privacy-utility maximization problem is
equivalent to solving a finite-dimensional linear program, the outcome of which
is the optimal staircase mechanism. However, solving this linear program can be
computationally expensive since it has a number of variables that is
exponential in the size of the alphabet the data lives in. To account for this,
we show that two simple privatization mechanisms, the binary and randomized
response mechanisms, are universally optimal in the low and high privacy
regimes, and well approximate the intermediate regime.
Comment: 52 pages, 10 figures in JMLR 201
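As an illustration of the kind of simple privatization mechanism the abstract refers to, the sketch below implements the standard $k$-ary randomized response: the true symbol is reported with probability $e^{\epsilon}/(e^{\epsilon}+k-1)$, and each other symbol with probability $1/(e^{\epsilon}+k-1)$. The function names and the integer alphabet $\{0, \dots, k-1\}$ are assumptions made for this example; the paper's exact definitions of its binary and randomized response mechanisms may differ.

```python
import math
import random


def randomized_response_probs(k, epsilon):
    """Return (p_keep, p_other) for k-ary randomized response, k >= 2."""
    e = math.exp(epsilon)
    p_keep = e / (e + k - 1)      # probability of reporting the true symbol
    p_other = 1.0 / (e + k - 1)   # probability of each specific other symbol
    return p_keep, p_other


def randomized_response(value, k, epsilon, rng=random):
    """Privatize `value` in {0, ..., k-1} under epsilon-local differential privacy."""
    p_keep, _ = randomized_response_probs(k, epsilon)
    if rng.random() < p_keep:
        return value
    # Otherwise report one of the k - 1 other symbols uniformly at random.
    other = rng.randrange(k - 1)
    return other if other < value else other + 1
```

The ratio `p_keep / p_other` equals $e^{\epsilon}$, which is the largest likelihood ratio between any two inputs for a given output, so the local differential privacy constraint holds with equality.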
