248 research outputs found

    The Optimal Mechanism in Differential Privacy

    We derive the optimal $\epsilon$-differentially private mechanism for a single real-valued query function under a very general utility-maximization (or cost-minimization) framework. The class of noise probability distributions in the optimal mechanism has staircase-shaped probability density functions which are symmetric (around the origin), monotonically decreasing and geometrically decaying. The staircase mechanism can be viewed as a geometric mixture of uniform probability distributions, providing a simple algorithmic description for the mechanism. Furthermore, the staircase mechanism naturally generalizes to discrete query output settings as well as more abstract settings. We explicitly derive the optimal noise probability distributions with minimum expectation of noise amplitude and power. Comparing the optimal performances with those of the Laplacian mechanism, we show that in the high privacy regime ($\epsilon$ is small), the Laplacian mechanism is asymptotically optimal as $\epsilon \to 0$; in the low privacy regime ($\epsilon$ is large), the minimum expectation of noise amplitude and minimum noise power are $\Theta(\Delta e^{-\frac{\epsilon}{2}})$ and $\Theta(\Delta^2 e^{-\frac{2\epsilon}{3}})$ as $\epsilon \to +\infty$, while the expectation of noise amplitude and power using the Laplacian mechanism are $\frac{\Delta}{\epsilon}$ and $\frac{2\Delta^2}{\epsilon^2}$, where $\Delta$ is the sensitivity of the query function. We conclude that the gains are more pronounced in the low privacy regime. Comment: 40 pages, 5 figures. Part of this work was presented in DIMACS Workshop on Recent Work on Differential Privacy across Computer Science, October 24 - 26, 201
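The "geometric mixture of uniform probability distributions" description can be sketched as a sampler. This is a minimal illustration, not the authors' code: the function name `sample_staircase`, its parameters, and the default step-width parameter `gamma = 1 / (1 + e^{\epsilon/2})` (assumed here to be the amplitude-minimizing choice) are assumptions that do not appear in the abstract.

```python
import math
import random


def sample_staircase(epsilon, sensitivity=1.0, gamma=None, rng=random):
    """Draw one noise sample from a staircase-shaped density (sketch).

    Assumed density: on each period [k*D, (k+1)*D) the density takes one
    value on the first gamma fraction and e^{-epsilon} times that value on
    the rest; successive periods decay by a factor e^{-epsilon}.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    b = math.exp(-epsilon)
    if gamma is None:
        # Assumed amplitude-minimizing step width (not stated in the abstract).
        gamma = 1.0 / (1.0 + math.exp(epsilon / 2.0))

    # Symmetric around the origin: pick a sign with probability 1/2 each.
    sign = 1.0 if rng.random() < 0.5 else -1.0

    # Geometric period index: P(step = i) = (1 - b) * b**i for i = 0, 1, ...
    u = 1.0 - rng.random()  # uniform on (0, 1], so log(u) is finite
    step = math.floor(-math.log(u) / epsilon)

    # Within a period, choose the higher or the lower stair, then a uniform
    # position inside the chosen stair.
    p_high = gamma / (gamma + (1.0 - gamma) * b)
    v = rng.random()
    if rng.random() < p_high:
        offset = gamma * v
    else:
        offset = gamma + (1.0 - gamma) * v

    return sign * (step + offset) * sensitivity
```

Under these assumptions the empirical mean of `abs(sample_staircase(2.0))` over many draws should fall below the Laplacian value $\frac{\Delta}{\epsilon} = 0.5$, in line with the low-privacy comparison in the abstract.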

    Demystifying Fixed k-Nearest Neighbor Information Estimators

    Estimating mutual information from i.i.d. samples drawn from an unknown joint density function is a basic statistical problem of broad interest with multitudinous applications. The most popular estimator is one proposed by Kraskov, Stögbauer, and Grassberger (KSG) in 2004, and is nonparametric and based on the distances of each sample to its $k^{\rm th}$ nearest neighboring sample, where $k$ is a fixed small integer. Despite its widespread use (part of scientific software packages), theoretical properties of this estimator have been largely unexplored. In this paper we demonstrate that the estimator is consistent and also identify an upper bound on the rate of convergence of the bias as a function of number of samples. We argue that the superior performance benefits of the KSG estimator stem from a curious "correlation boosting" effect and build on this intuition to modify the KSG estimator in novel ways to construct a superior estimator. As a byproduct of our investigations, we obtain nearly tight rates of convergence of the $\ell_2$ error of the well known fixed $k$ nearest neighbor estimator of differential entropy by Kozachenko and Leonenko. Comment: 55 pages, 8 figures

    Capacity of Fading Gaussian Channel with an Energy Harvesting Sensor Node

    Network lifetime maximization is becoming an important design goal in wireless sensor networks. Energy harvesting has recently become a preferred choice for achieving this goal as it provides near perpetual operation. We study such a sensor node with an energy harvesting source and compare various architectures by which the harvested energy is used. We find its Shannon capacity when it is transmitting its observations over a fading AWGN channel with perfect/no channel state information provided at the transmitter. We obtain an achievable rate when there are inefficiencies in energy storage and the capacity when energy is spent in activities other than transmission. Comment: 6 Pages, To be presented at IEEE GLOBECOM 201

    MORSE: Semantic-ally Drive-n MORpheme SEgment-er

    We present in this paper a novel framework for morpheme segmentation which uses the morpho-syntactic regularities preserved by word representations, in addition to orthographic features, to segment words into morphemes. This framework is the first to consider vocabulary-wide syntactico-semantic information for this task. We also analyze the deficiencies of available benchmarking datasets and introduce our own dataset that was created on the basis of compositionality. We validate our algorithm across datasets and present state-of-the-art results.

    Extremal Mechanisms for Local Differential Privacy

    Local differential privacy has recently surfaced as a strong measure of privacy in contexts where personal information remains private even from data analysts. Working in a setting where both the data providers and data analysts want to maximize the utility of statistical analyses performed on the released data, we study the fundamental trade-off between local differential privacy and utility. This trade-off is formulated as a constrained optimization problem: maximize utility subject to local differential privacy constraints. We introduce a combinatorial family of extremal privatization mechanisms, which we call staircase mechanisms, and show that it contains the optimal privatization mechanisms for a broad class of information theoretic utilities such as mutual information and $f$-divergences. We further prove that for any utility function and any privacy level, solving the privacy-utility maximization problem is equivalent to solving a finite-dimensional linear program, the outcome of which is the optimal staircase mechanism. However, solving this linear program can be computationally expensive since it has a number of variables that is exponential in the size of the alphabet the data lives in. To account for this, we show that two simple privatization mechanisms, the binary and randomized response mechanisms, are universally optimal in the low and high privacy regimes, and well approximate the intermediate regime. Comment: 52 pages, 10 figures in JMLR 201
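As a rough illustration of the randomized response mechanism named in this abstract, the sketch below assumes the usual k-ary form: report the true symbol with probability $e^{\epsilon} / (e^{\epsilon} + k - 1)$ and any other symbol uniformly otherwise. The function names and the integer alphabet `0..k-1` are hypothetical choices for the example; the binary mechanism is not covered.

```python
import math
import random


def randomized_response_matrix(k, epsilon):
    """Row x holds P(report y | true value x) for k-ary randomized response."""
    e = math.exp(epsilon)
    p_true = e / (e + k - 1)
    p_other = 1.0 / (e + k - 1)
    return [[p_true if y == x else p_other for y in range(k)]
            for x in range(k)]


def privatize(x, k, epsilon, rng=random):
    """Privatize one symbol x in {0, ..., k-1} (assumed alphabet)."""
    e = math.exp(epsilon)
    if rng.random() < e / (e + k - 1):
        return x
    # Otherwise pick uniformly among the k - 1 symbols different from x.
    y = rng.randrange(k - 1)
    return y if y < x else y + 1


def worst_case_ratio(matrix):
    """Largest ratio P(y | x) / P(y | x') over all outputs and input pairs."""
    k = len(matrix)
    return max(matrix[x][y] / matrix[x2][y]
               for y in range(k) for x in range(k) for x2 in range(k))
```

Local differential privacy at level $\epsilon$ requires that `worst_case_ratio` not exceed $e^{\epsilon}$; for this matrix the ratio equals $e^{\epsilon}$ exactly, so the constraint is tight.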