11,869 research outputs found

    Differentially Private Nonparametric Hypothesis Testing

    Hypothesis tests are a crucial statistical tool for data mining and are the workhorse of scientific research in many fields. Here we study differentially private tests of independence between a categorical and a continuous variable. We take as our starting point traditional nonparametric tests, which make no distributional assumptions (e.g., normality) about the data. We present private analogues of the Kruskal-Wallis, Mann-Whitney, and Wilcoxon signed-rank tests, as well as the parametric one-sample t-test. These tests use novel test statistics developed specifically for the private setting. We compare our tests to prior work on both parametric and nonparametric tests. We find that in all cases our new nonparametric tests achieve large improvements in statistical power, even when the assumptions of parametric tests are met.

    Differentially Private Release and Learning of Threshold Functions

    We prove new upper and lower bounds on the sample complexity of $(\epsilon, \delta)$ differentially private algorithms for releasing approximate answers to threshold functions. A threshold function $c_x$ over a totally ordered domain $X$ evaluates to $c_x(y) = 1$ if $y \le x$, and evaluates to $0$ otherwise. We give the first nontrivial lower bound for releasing thresholds with $(\epsilon, \delta)$ differential privacy, showing that the task is impossible over an infinite domain $X$, and moreover requires sample complexity $n \ge \Omega(\log^*|X|)$, which grows with the size of the domain. Inspired by the techniques used to prove this lower bound, we give an algorithm for releasing thresholds with $n \le 2^{(1 + o(1))\log^*|X|}$ samples. This improves the previous best upper bound of $8^{(1 + o(1))\log^*|X|}$ (Beimel et al., RANDOM '13). Our sample complexity upper and lower bounds also apply to the tasks of learning distributions with respect to Kolmogorov distance and of properly PAC learning thresholds with differential privacy. The lower bound gives the first separation between the sample complexity of properly learning a concept class with $(\epsilon, \delta)$ differential privacy and learning without privacy. For properly learning thresholds in $\ell$ dimensions, this lower bound extends to $n \ge \Omega(\ell \cdot \log^*|X|)$. To obtain our results, we give reductions in both directions from releasing and properly learning thresholds and the simpler interior point problem. Given a database $D$ of elements from $X$, the interior point problem asks for an element between the smallest and largest elements in $D$. We introduce new recursive constructions for bounding the sample complexity of the interior point problem, as well as further reductions and techniques for proving impossibility results for other basic problems in differential privacy. Comment: 43 pages.
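
    The two definitions in this abstract (threshold functions and the interior point problem) can be made concrete with a small, non-private sketch. It only illustrates the objects being studied, not the paper's private algorithms; the function names and the integer domain are assumptions made for this example.

```python
from typing import Callable, Dict, Sequence


def threshold(x: int) -> Callable[[int], int]:
    """Return the threshold function c_x, where c_x(y) = 1 if y <= x, else 0."""
    return lambda y: 1 if y <= x else 0


def threshold_answers(database: Sequence[int], domain: Sequence[int]) -> Dict[int, float]:
    """Exact (non-private) answer to every threshold query c_x for x in `domain`.

    The answer to c_x is taken here to be the fraction of database
    elements y with c_x(y) = 1. A private release would return
    approximations of these values instead.
    """
    n = len(database)
    return {x: sum(threshold(x)(y) for y in database) / n for x in domain}


def is_interior_point(candidate: int, database: Sequence[int]) -> bool:
    """Check whether `candidate` lies between the smallest and largest elements."""
    return min(database) <= candidate <= max(database)


# Hypothetical database and query points, chosen only for illustration.
database = [2, 9, 4]
answers = threshold_answers(database, domain=[1, 4, 10])
```

    Any element of the database is trivially an interior point without privacy; the difficulty the abstract describes arises only once the output must satisfy differential privacy.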

    Effect of sample size and P-value filtering techniques on the detection of transcriptional changes induced in rat neuroblastoma (NG108) cells by mefloquine

    BACKGROUND: There is no known biochemical basis for the adverse neurological events attributed to mefloquine. Identification of genes modulated by toxic agents using microarrays may provide sufficient information to generate hypotheses regarding their mode of action. However, this utility may be compromised if sample sizes are too low or the filtering methods used to identify differentially expressed genes are inappropriate. METHODS: The transcriptional changes induced in rat neuroblastoma cells by a physiological dose of mefloquine (10 micromolar) were investigated using Affymetrix arrays. A large sample size was used (total of 16 arrays). Genes were ranked by P-value (t-test). RT-PCR was used to confirm (or reject) the expression changes of several of the genes with the lowest P-values. Different P-value filtering methods were compared in terms of their ability to detect these differentially expressed genes. A retrospective power analysis was then performed to determine whether the use of lower sample sizes might also have detected those genes with altered transcription. RESULTS: Based on RT-PCR, mefloquine upregulated cJun, IkappaB and GADD153. Reverse Holm-Bonferroni P-value filtering was superior to other methods in terms of maximizing detection of differentially expressed genes but not those with unaltered expression. Reducing the total microarray sample size to fewer than 10 arrays impaired the capacity to detect differentially expressed genes. CONCLUSIONS: Adequate sample sizes and appropriate selection of P-value filtering methods are essential for the reliable detection of differentially expressed genes. The changes in gene expression induced by mefloquine suggest that the ER might be a neuronal target of the drug.
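
    For readers unfamiliar with P-value filtering, the sketch below shows the standard Holm-Bonferroni step-down procedure. It is an assumption-laden illustration: the abstract names a "reverse" Holm-Bonferroni variant whose details are not given here, so this is the textbook procedure rather than the authors' method, and the P-values are made up.

```python
from typing import List, Sequence


def holm_bonferroni(p_values: Sequence[float], alpha: float = 0.05) -> List[bool]:
    """Standard Holm-Bonferroni step-down test.

    Returns a list of booleans, in the input order, marking which
    hypotheses are rejected at family-wise error rate `alpha`.
    """
    m = len(p_values)
    # Indices of the P-values, sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    rejected = [False] * m
    for rank, idx in enumerate(order):
        # The k-th smallest P-value (rank = k - 1) is compared with alpha / (m - k + 1).
        if p_values[idx] <= alpha / (m - rank):
            rejected[idx] = True
        else:
            # Step-down rule: stop at the first failure; all larger P-values are retained.
            break
    return rejected


# Hypothetical P-values for four genes, for illustration only.
flags = holm_bonferroni([0.01, 0.04, 0.03, 0.005], alpha=0.05)
```

    With these illustrative values, only the genes with P-values 0.005 and 0.01 pass the filter, because the third-smallest value (0.03) exceeds its threshold of 0.05 / 2.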

    Systems genetic analysis of addiction-associated traits

    Substance abuse disorders are heritable neuropsychiatric disorders with largely unknown genetic etiology. Distinct genetic factors likely contribute to the different stages and behaviors of addiction, including initial sensitivity to the subjective and physiological effects of drugs and physiological and psychological measures of withdrawal. Mammalian model organisms permit a comprehensive approach to gene mapping and to bridging genetic variation with neurobiological mechanisms of addiction-relevant behaviors. The focus of this dissertation is to investigate the genetic basis of the rewarding and aversive properties of opioids, utilizing a systems genetics approach that includes both forward and reverse genetics in combination with transcriptomics and bioinformatics as tools to determine the molecular mechanisms. The first aim of this research is to conduct a genetic linkage mapping study of addiction-associated traits in a reduced complexity cross of two nearly identical B6 substrains (C57BL/6J and C57BL/6NJ). Forward genetic techniques, such as quantitative trait locus (QTL) mapping, were used to identify novel candidate genes involved in addiction-associated traits. We completed QTL mapping combined with genome-wide gene expression analyses to rapidly identify compelling candidate genes underlying addiction traits. Most notably, we identified a region on distal chromosome 1 that regulates opioid sensitivity and withdrawal. Using striatal expression QTL mapping, transcript/behavior covariance, and convergent haplotype analysis, we identified a strong positional candidate gene, Rgs7. The second aim of this research is to validate novel candidate genes and molecular mechanisms responsible for modulation of opioid reward and aversion. Using behavioral and expression QTL mapping, Csnk1e was previously identified as a candidate gene for psychostimulant sensitivity. Here, we utilized Csnk1e knockout mice to confirm the effect of Csnk1e deletion on opioid sensitivity and extend its role to opioid reward and to a natural reward dependent on opioid signaling, namely sweetened palatable food consumption. Additionally, we have utilized striatal transcriptome analyses to identify potential molecular mechanisms, including aberrant myelination and neurodevelopment of the striatum. In summary, this dissertation research utilizes mouse forward and reverse genetics, in combination with transcriptome and bioinformatics analyses, to identify the genetic and neurobiological underpinnings of addiction-associated traits.