17 research outputs found
Not All Learnable Distribution Classes are Privately Learnable
We give an example of a class of distributions that is learnable in total
variation distance with a finite number of samples, but not learnable under
(ε, δ)-differential privacy. This refutes a conjecture of Ashtiani.
Comment: To appear in ALT 2024. Added a minor clarification to the
construction and an acknowledgement of the Fields Institute
A Polynomial Time, Pure Differentially Private Estimator for Binary Product Distributions
We present the first ε-differentially private, computationally
efficient algorithm that estimates the means of product distributions over
{0,1}^d accurately in total-variation distance, while attaining the
optimal sample complexity to within polylogarithmic factors. Prior work had
either solved this problem efficiently and optimally under weaker notions of
privacy, or solved it optimally while having exponential running times.
Better and Simpler Lower Bounds for Differentially Private Statistical Estimation
We provide improved lower bounds for two well-known high-dimensional private
estimation tasks. First, we prove that for estimating the covariance of a
Gaussian up to spectral error α with approximate differential privacy,
one needs Ω̃(d^(3/2)/(αε) + d/α^2) samples for any α ≤ O(1), which is tight up
to logarithmic factors. This improves over previous work, which established this
for α ≤ O(1/√d), and is also simpler. Next, we prove that for estimating the
mean of a heavy-tailed distribution with bounded k-th moments with approximate
differential privacy, one needs Ω̃(d/(α^(k/(k-1)) ε) + d/α^2) samples. This
matches known upper bounds and improves over the best known lower bounds for
this problem, which only hold for pure differential privacy, or when
k = 2. Our techniques follow the method of
fingerprinting and are generally quite simple. Our lower bound for heavy-tailed
estimation is based on a black-box reduction from privately estimating
identity-covariance Gaussians. Our lower bound for covariance estimation
utilizes a Bayesian approach to show that, under an Inverse Wishart prior
distribution for the covariance matrix, no private estimator can be accurate
even in expectation, without sufficiently many samples.
Comment: 23 pages
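The fingerprinting method mentioned above can be illustrated numerically: an
accurate estimator must correlate with the sample points it was computed from,
and that correlation is what the lower-bound arguments exploit. Below is a
small sketch with hypothetical parameter choices, using the classic setup of
mean estimation for a product distribution over {-1,1}^d (a simplification,
not the paper's construction).

```python
import numpy as np

rng = np.random.default_rng(0)
d, n = 200, 50
# Random mean vector, then i.i.d. +/-1 data with E[X_ij] = p_j
p = rng.uniform(-1, 1, size=d)
X = np.where(rng.random((n, d)) < (1 + p) / 2, 1.0, -1.0)
est = X.mean(axis=0)  # non-private empirical-mean estimator

# Fingerprinting statistic: correlation of a point with the estimator's error.
# In-sample points correlate strongly with the estimator built from them;
# fresh points drawn from the same distribution do not.
in_sample = (X - p) @ (est - p)
fresh = np.where(rng.random((n, d)) < (1 + p) / 2, 1.0, -1.0)
out_sample = (fresh - p) @ (est - p)
```

The gap between the two averages is the "memorization" that differential
privacy forbids, which is how such statistics yield sample-complexity lower
bounds.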
Privacy Preserving Adaptive Experiment Design
Adaptive experiments are widely used to estimate the conditional average
treatment effect (CATE) in clinical trials and many other scenarios. While the
primary goal of an experiment is to maximize estimation accuracy, the
imperative of social welfare also makes it crucial to provide treatments with
superior outcomes to patients, which is measured by regret in the contextual
bandit framework. These two objectives often lead to contrasting optimal
allocation mechanisms. Furthermore, privacy concerns arise in clinical
scenarios involving sensitive data such as patients' health records, so it is
essential for the treatment allocation mechanism to incorporate robust privacy
protection measures. In this paper, we investigate the tradeoff between loss
of social welfare and statistical power in contextual bandit experiments. We
establish matched upper and lower bounds for the multi-objective optimization
problem, and then adopt the concept of Pareto optimality to mathematically
characterize the optimality condition. Furthermore, we propose differentially
private algorithms that still match the lower bound, showing that privacy is
"almost free". Additionally, we derive the asymptotic normality of the
estimator, which is essential for statistical inference and hypothesis
testing.
Comment: Add a table
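To make "almost free" privacy concrete at a toy scale, here is a sketch of
the standard ingredient such algorithms build on: releasing each arm's
estimated mean through the Laplace mechanism, so that any one patient's data
barely shifts the allocation decision. The explore-then-commit setup, arm
means, and parameters below are all hypothetical assumptions, not the paper's
algorithm.

```python
import numpy as np

rng = np.random.default_rng(1)

def private_mean(rewards, eps, rng):
    """eps-DP release of a mean of [0,1]-bounded rewards via the Laplace
    mechanism: one participant's reward shifts the mean by at most 1/n."""
    n = len(rewards)
    return float(np.mean(rewards) + rng.laplace(scale=1.0 / (n * eps)))

# Hypothetical two-arm explore-then-commit experiment
true_means = [0.5, 0.7]
n_explore = 1000
estimates = []
for mu in true_means:
    rewards = rng.binomial(1, mu, size=n_explore)
    estimates.append(private_mean(rewards, eps=1.0, rng=rng))
best_arm = int(np.argmax(estimates))
```

With n_explore samples per arm, the Laplace noise scale 1/(nε) is far smaller
than the sampling error, which is the intuition behind privacy costing almost
nothing in this regime.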
CoinPress: Practical Private Mean and Covariance Estimation
We present simple differentially private estimators for the mean and
covariance of multivariate sub-Gaussian data that are accurate at small sample
sizes. We demonstrate the effectiveness of our algorithms both theoretically
and empirically using synthetic and real-world datasets---showing that their
asymptotic error rates match the state-of-the-art theoretical bounds, and that
they concretely outperform all previous methods. Specifically, previous
estimators either have weak empirical accuracy at small sample sizes, perform
poorly for multivariate data, or require the user to provide strong a priori
estimates for the parameters.
Comment: Code is available at https://github.com/twistedcubic/coin-pres
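The iterative idea behind such estimators can be sketched in one dimension:
start from a crude interval known to contain the mean, clip the data to it,
release a noisy clipped mean, and use that release to shrink the interval
before repeating. The function below is a simplified illustration in this
spirit; the step count, slack factors, and zCDP budgeting are assumptions,
not CoinPress's exact calibration.

```python
import numpy as np

rng = np.random.default_rng(2)

def coinpress_style_mean(x, center, radius, sigma_data, rho, steps=3, rng=rng):
    """Iterative private 1-D mean estimation in the spirit of CoinPress:
    clip to the current interval, average, add Gaussian noise, shrink.
    Each step spends rho/steps of the zCDP budget. Simplified sketch."""
    n = len(x)
    for _ in range(steps):
        # Points within `radius` of the mean stay unclipped w.h.p.
        clip_r = radius + 3 * sigma_data
        clipped = np.clip(x, center - clip_r, center + clip_r)
        sens = 2 * clip_r / n                         # sensitivity of clipped mean
        noise_sd = sens / np.sqrt(2 * (rho / steps))  # Gaussian mechanism (zCDP)
        center = clipped.mean() + rng.normal(scale=noise_sd)
        # New radius: sampling error plus noise, with slack
        radius = 3 * (sigma_data / np.sqrt(n) + noise_sd)
    return center

# Usage with a deliberately crude initial interval
x = rng.normal(5.0, 1.0, size=2000)
est = coinpress_style_mean(x, center=0.0, radius=50.0, sigma_data=1.0, rho=0.5)
```

The key effect, which the paper demonstrates at scale, is that a very weak
a-priori interval costs little: each iteration shrinks it geometrically, so
the noise added in the final step is calibrated to a tight range.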