11 research outputs found

    An Improved Private Mechanism for Small Databases

    Full text link
    We study the problem of answering a workload of linear queries Q\mathcal{Q}, on a database of size at most n=o(Q)n = o(|\mathcal{Q}|) drawn from a universe U\mathcal{U} under the constraint of (approximate) differential privacy. Nikolov, Talwar, and Zhang~\cite{NTZ} proposed an efficient mechanism that, for any given Q\mathcal{Q} and nn, answers the queries with average error that is at most a factor polynomial in logQ\log |\mathcal{Q}| and logU\log |\mathcal{U}| worse than the best possible. Here we improve on this guarantee and give a mechanism whose competitiveness ratio is at most polynomial in logn\log n and logU\log |\mathcal{U}|, and has no dependence on Q|\mathcal{Q}|. Our mechanism is based on the projection mechanism of Nikolov, Talwar, and Zhang, but in place of an ad-hoc noise distribution, we use a distribution which is in a sense optimal for the projection mechanism, and analyze it using convex duality and the restricted invertibility principle.Comment: To appear in ICALP 2015, Track

    MVG Mechanism: Differential Privacy under Matrix-Valued Query

    Full text link
    Differential privacy mechanism design has traditionally been tailored for a scalar-valued query function. Although many mechanisms such as the Laplace and Gaussian mechanisms can be extended to a matrix-valued query function by adding i.i.d. noise to each element of the matrix, this method is often suboptimal as it forfeits an opportunity to exploit the structural characteristics typically associated with matrix analysis. To address this challenge, we propose a novel differential privacy mechanism called the Matrix-Variate Gaussian (MVG) mechanism, which adds a matrix-valued noise drawn from a matrix-variate Gaussian distribution, and we rigorously prove that the MVG mechanism preserves (ϵ,δ)(\epsilon,\delta)-differential privacy. Furthermore, we introduce the concept of directional noise made possible by the design of the MVG mechanism. Directional noise allows the impact of the noise on the utility of the matrix-valued query function to be moderated. Finally, we experimentally demonstrate the performance of our mechanism using three matrix-valued queries on three privacy-sensitive datasets. We find that the MVG mechanism notably outperforms four previous state-of-the-art approaches, and provides comparable utility to the non-private baseline.Comment: Appeared in CCS'1

    Lower Bounds for Differential Privacy from Gaussian Width

    Get PDF
    We study the optimal sample complexity of a given workload of linear queries under the constraints of differential privacy. The sample complexity of a query answering mechanism under error parameter alpha is the smallest n such that the mechanism answers the workload with error at most alpha on any database of size n. Following a line of research started by Hardt and Talwar [STOC 2010], we analyze sample complexity using the tools of asymptotic convex geometry. We study the sensitivity polytope, a natural convex body associated with a query workload that quantifies how query answers can change between neighboring databases. This is the information that, roughly speaking, is protected by a differentially private algorithm, and, for this reason, we expect that a "bigger" sensitivity polytope implies larger sample complexity. Our results identify the mean Gaussian width as an appropriate measure of the size of the polytope, and show sample complexity lower bounds in terms of this quantity. Our lower bounds completely characterize the workloads for which the Gaussian noise mechanism is optimal up to constants as those having asymptotically maximal Gaussian width. Our techniques also yield an alternative proof of Pisier\u27s Volume Number Theorem which also suggests an approach to improving the parameters of the theorem

    On Differentially Private Counting on Trees

    Get PDF
    We study the problem of performing counting queries at different levels in hierarchical structures while preserving individuals\u27 privacy. Motivated by applications, we propose a new error measure for this problem by considering a combination of multiplicative and additive approximation to the query results. We examine known mechanisms in differential privacy (DP) and prove their optimality, under this measure, in the pure-DP setting. In the approximate-DP setting, we design new algorithms achieving significant improvements over known ones

    New oracle-efficient algorithms for private synthetic data release

    Full text link
    We present three new algorithms for constructing differentially private synthetic data—a sanitized version of a sensitive dataset that approximately preserves the answers to a large collection of statistical queries. All three algorithms are oracle-efficient in the sense that they are computationally efficient when given access to an optimization oracle. Such an oracle can be implemented using many existing (non-private) optimization tools such as sophisticated integer program solvers. While the accuracy of the synthetic data is contingent on the oracle’s optimization performance, the algorithms satisfy differential privacy even in the worst case. For all three algorithms, we provide theoretical guarantees for both accuracy and privacy. Through empirical evaluation, we demonstrate that our methods scale well with both the dimensionality of the data and the number of queries. Compared to the state-of-the-art method High-Dimensional Matrix Mechanism McKenna et al. (2018), our algorithms provide better accuracy in the large workload and high privacy regime (corresponding to low privacy loss ε).https://arxiv.org/pdf/2007.05453.pd
    corecore