17 research outputs found

    Not All Learnable Distribution Classes are Privately Learnable

    We give an example of a class of distributions that is learnable in total variation distance with a finite number of samples, but not learnable under $(\varepsilon, \delta)$-differential privacy. This refutes a conjecture of Ashtiani.

    Comment: To appear in ALT 2024. Added a minor clarification to the construction and an acknowledgement of the Fields Institute.

    A Polynomial Time, Pure Differentially Private Estimator for Binary Product Distributions

    We present the first $\varepsilon$-differentially private, computationally efficient algorithm that estimates the means of product distributions over $\{0,1\}^d$ accurately in total variation distance, while attaining the optimal sample complexity up to polylogarithmic factors. Prior work had either solved this problem efficiently and optimally under weaker notions of privacy, or solved it optimally while requiring exponential running time.
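As a point of reference for the guarantee described above, the textbook pure-DP baseline for this task is the Laplace mechanism applied coordinate-wise to the empirical means. The sketch below is only that naive baseline (function name and parameter choices are ours), not the paper's optimal algorithm:

```python
import numpy as np

def naive_private_product_means(X, eps, rng=None):
    """Pure eps-DP estimate of the coordinate means of X in {0,1}^{n x d}
    via the Laplace mechanism. Changing one row moves the mean vector by
    at most d/n in L1 norm, so Laplace noise of scale d/(n*eps) in each
    coordinate suffices for eps-DP. This is the textbook baseline, whose
    error is suboptimal in high dimensions."""
    rng = np.random.default_rng(rng)
    n, d = X.shape
    means = X.mean(axis=0)                      # empirical coordinate means
    noise = rng.laplace(scale=d / (n * eps), size=d)
    return np.clip(means + noise, 0.0, 1.0)     # project back into [0,1]^d
```

The per-coordinate noise scale grows linearly in $d$, which is exactly the inefficiency that more careful private estimators avoid.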

    Better and Simpler Lower Bounds for Differentially Private Statistical Estimation

    We provide improved lower bounds for two well-known high-dimensional private estimation tasks. First, we prove that estimating the covariance of a Gaussian up to spectral error $\alpha$ with approximate differential privacy requires $\tilde{\Omega}\left(\frac{d^{3/2}}{\alpha \varepsilon} + \frac{d}{\alpha^2}\right)$ samples for any $\alpha \le O(1)$, which is tight up to logarithmic factors. This improves over previous work, which established this only for $\alpha \le O\left(\frac{1}{\sqrt{d}}\right)$, and our argument is also simpler. Next, we prove that estimating the mean of a heavy-tailed distribution with bounded $k$th moments with approximate differential privacy requires $\tilde{\Omega}\left(\frac{d}{\alpha^{k/(k-1)} \varepsilon} + \frac{d}{\alpha^2}\right)$ samples. This matches known upper bounds and improves over the best known lower bound for this problem, which holds only for pure differential privacy or for $k = 2$. Our techniques follow the method of fingerprinting and are generally quite simple. Our lower bound for heavy-tailed estimation is based on a black-box reduction from privately estimating identity-covariance Gaussians. Our lower bound for covariance estimation uses a Bayesian approach to show that, under an inverse Wishart prior on the covariance matrix, no private estimator can be accurate even in expectation without sufficiently many samples.

    Comment: 23 pages.

    Privacy Preserving Adaptive Experiment Design

    Adaptive experiments are widely used to estimate the conditional average treatment effect (CATE) in clinical trials and many other settings. While the primary goal of an experiment is to maximize estimation accuracy, social welfare makes it equally important to assign patients the treatments with superior outcomes, an objective measured by regret in the contextual bandit framework. These two objectives often lead to contrasting optimal allocation mechanisms. Furthermore, privacy concerns arise in clinical settings that involve sensitive data such as patients' health records, so the treatment allocation mechanism must incorporate robust privacy protection. In this paper, we investigate the tradeoff between loss of social welfare and statistical power in contextual bandit experiments. We establish matching upper and lower bounds for the multi-objective optimization problem, and adopt the concept of Pareto optimality to characterize the optimality condition mathematically. Furthermore, we propose differentially private algorithms that still match the lower bound, showing that privacy is "almost free". Additionally, we derive the asymptotic normality of the estimator, which is essential for statistical inference and hypothesis testing.

    Comment: Added a table.

    CoinPress: Practical Private Mean and Covariance Estimation

    We present simple differentially private estimators for the mean and covariance of multivariate sub-Gaussian data that are accurate at small sample sizes. We demonstrate the effectiveness of our algorithms both theoretically and empirically on synthetic and real-world datasets, showing that their asymptotic error rates match the state-of-the-art theoretical bounds and that they concretely outperform all previous methods. Specifically, previous estimators either have weak empirical accuracy at small sample sizes, perform poorly on multivariate data, or require the user to provide strong a priori estimates of the parameters.

    Comment: Code is available at https://github.com/twistedcubic/coin-pres
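The core move in estimators of this kind can be sketched as a single clip-average-noise refinement step: given a crude ball known to contain the mean, project the data onto it, average, and add Gaussian noise calibrated to the resulting sensitivity. The sketch below uses our own naming and a standard Gaussian-mechanism calibration; it is an illustrative step, not the exact algorithm from the paper:

```python
import numpy as np

def private_mean_step(X, center, radius, eps, delta, rng=None):
    """One clip-average-noise refinement step for (eps, delta)-DP mean
    estimation. Points are projected onto the ball B(center, radius), so
    changing one of the n rows moves the clipped mean by at most
    2*radius/n in L2 norm; Gaussian noise is sized for that sensitivity
    using the standard Gaussian-mechanism calibration (valid for eps <= 1)."""
    rng = np.random.default_rng(rng)
    n, d = X.shape
    diff = X - center
    norms = np.linalg.norm(diff, axis=1, keepdims=True)
    scale = np.minimum(1.0, radius / np.maximum(norms, 1e-12))
    clipped = center + diff * scale             # projection onto the ball
    sensitivity = 2.0 * radius / n              # L2 sensitivity of the clipped mean
    sigma = sensitivity * np.sqrt(2.0 * np.log(1.25 / delta)) / eps
    return clipped.mean(axis=0) + rng.normal(scale=sigma, size=d)
```

Estimators like the one above iterate such steps, splitting the privacy budget across rounds and shrinking the radius as the estimate sharpens, which is what makes the noise small even at modest sample sizes.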