    Direction-Projection-Permutation for High Dimensional Hypothesis Tests

    Motivated by the prevalence of high dimensional, low sample size datasets in modern statistical applications, we propose a general nonparametric framework, Direction-Projection-Permutation (DiProPerm), for testing high dimensional hypotheses. The method is aimed at rigorously testing whether visual differences seen in lower dimensional projections are statistically significant. Theoretical analysis under the non-classical asymptotic regime, in which the dimension goes to infinity for fixed sample size, reveals that certain natural variations of DiProPerm can behave very differently. An empirical power study both confirms the theoretical results and suggests that DiProPerm is a powerful test in many settings. Finally, DiProPerm is applied to a high dimensional gene expression dataset.
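The three steps named in the framework (direction, projection, permutation) can be illustrated with a minimal sketch. The choices below are assumptions made for illustration and are not taken from the abstract: the direction is the normalized difference of class means, the test statistic is the absolute gap between the projected class means, and the function names are hypothetical. The direction is recomputed for every relabeling, so the permuted statistics go through the same fitting step as the observed one.

```python
import random


def mean_vector(rows):
    """Coordinate-wise mean of a list of equal-length vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def mean_difference_direction(x, y):
    """Direction step (assumed choice): unit vector along the difference of class means."""
    diff = [a - b for a, b in zip(mean_vector(x), mean_vector(y))]
    norm = sum(d * d for d in diff) ** 0.5
    return [d / norm for d in diff] if norm > 0 else diff


def projected_gap(x, y):
    """Projection step: project both classes onto the direction and
    return the absolute difference of the projected means."""
    w = mean_difference_direction(x, y)
    px = [sum(a * b for a, b in zip(row, w)) for row in x]
    py = [sum(a * b for a, b in zip(row, w)) for row in y]
    return abs(sum(px) / len(px) - sum(py) / len(py))


def diproperm_pvalue(x, y, n_perm=500, seed=0):
    """Permutation step: shuffle class labels, redo direction and projection,
    and compare the permuted statistics with the observed one."""
    rng = random.Random(seed)
    observed = projected_gap(x, y)
    pooled = list(x) + list(y)
    n_x = len(x)
    hits = 0
    for _ in range(n_perm):
        shuffled = pooled[:]
        rng.shuffle(shuffled)
        if projected_gap(shuffled[:n_x], shuffled[n_x:]) >= observed:
            hits += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    return (hits + 1) / (n_perm + 1)
```

Calling `diproperm_pvalue(x, y)` with two lists of equal-length numeric vectors returns an estimated p-value. Other directions or statistics could be substituted in `mean_difference_direction` and `projected_gap` without changing the permutation loop.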

    A statistical framework for testing functional categories in microarray data

    Ready access to emerging databases of gene annotation and functional pathways has shifted assessments of differential expression in DNA microarray studies from single genes to groups of genes with shared biological function. This paper takes a critical look at existing methods for assessing the differential expression of a group of genes (functional category), and provides some suggestions for improved performance. We begin by presenting a general framework, in which the set of genes in a functional category is compared to the complementary set of genes on the array. The framework includes tests for overrepresentation of a category within a list of significant genes, and methods that consider continuous measures of differential expression. Existing tests are divided into two classes. Class 1 tests assume gene-specific measures of differential expression are independent, despite overwhelming evidence of positive correlation. Analytic and simulated results are presented that demonstrate Class 1 tests are strongly anti-conservative in practice. Class 2 tests account for gene correlation, typically through array permutation that by construction has proper Type I error control for the induced null. However, both Class 1 and Class 2 tests use a null hypothesis that all genes have the same degree of differential expression. We introduce a more sensible and general (Class 3) null under which the profile of differential expression is the same within the category and complement. Under this broader null, Class 2 tests are shown to be conservative. We propose standard bootstrap methods for testing against the Class 3 null and demonstrate they provide valid Type I error control and more power than array permutation in simulated datasets and real microarray experiments. Comment: Published at http://dx.doi.org/10.1214/07-AOAS146 in the Annals of Applied Statistics (http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics (http://www.imstat.org)
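As an illustration of the overrepresentation tests mentioned in this abstract, the sketch below computes an upper-tail hypergeometric probability for the overlap between a category and a list of significant genes. This is one common form of such a test and is an assumption here, not necessarily the exact procedure examined in the paper. It treats genes as independent draws, which is the assumption the abstract attributes to Class 1 tests. The function and parameter names are hypothetical.

```python
from math import comb


def overrepresentation_pvalue(n_genes, n_category, n_significant, n_overlap):
    """P(overlap >= n_overlap) when n_significant genes are drawn without
    replacement from n_genes, of which n_category belong to the category."""
    upper = min(n_category, n_significant)
    total = comb(n_genes, n_significant)
    tail = sum(
        comb(n_category, k) * comb(n_genes - n_category, n_significant - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total


# Hypothetical numbers: 10 genes on the array, 3 in the category,
# 3 called significant, and all 3 significant genes fall in the category.
p = overrepresentation_pvalue(10, 3, 3, 3)  # equals 1/120
```

Because the calculation ignores correlation between genes, a small value from it is exactly the kind of result the abstract warns can be anti-conservative.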

    On the distribution of some Euler-Mahonian statistics

    We give a direct combinatorial proof of the equidistribution of two pairs of permutation statistics, (des, aid) and (lec, inv), which have been previously shown to have the same joint distribution as (exc, maj), the number of excedances and the major index of a permutation. Moreover, the triple (pix, lec, inv) was shown to have the same distribution as (fix, exc, maj), where fix is the number of fixed points of a permutation. We define a new statistic aix so that our bijection maps (pix, lec, inv) to (aix, des, aid). We also find an Eulerian partner das for a Mahonian statistic mix defined using mesh patterns, so that (das, mix) is equidistributed with (des, inv). Comment: 9 pages
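For readers unfamiliar with the terminology, a statistic is called Eulerian when it is distributed like des (the number of descents) and Mahonian when it is distributed like inv (the number of inversions). The sketch below computes the classical statistics des, maj, inv, exc and fix by brute force using their standard definitions, and checks two classical single-statistic facts on small symmetric groups. The statistics specific to this paper (aid, lec, pix, aix, das, mix) are not implemented, and the helper `distribution` is a hypothetical name.

```python
from collections import Counter
from itertools import permutations


def des(p):
    """Number of descents: positions i with p[i] > p[i+1]."""
    return sum(1 for i in range(len(p) - 1) if p[i] > p[i + 1])


def maj(p):
    """Major index: sum of the (1-based) descent positions."""
    return sum(i + 1 for i in range(len(p) - 1) if p[i] > p[i + 1])


def inv(p):
    """Number of inversions: pairs i < j with p[i] > p[j]."""
    n = len(p)
    return sum(1 for i in range(n) for j in range(i + 1, n) if p[i] > p[j])


def exc(p):
    """Number of excedances: 1-based positions i with p(i) > i."""
    return sum(1 for i, v in enumerate(p, start=1) if v > i)


def fix(p):
    """Number of fixed points: 1-based positions i with p(i) = i."""
    return sum(1 for i, v in enumerate(p, start=1) if v == i)


def distribution(stat, n):
    """Tally the values of a statistic over all permutations of 1..n."""
    return Counter(stat(p) for p in permutations(range(1, n + 1)))


# des and exc share a distribution (Eulerian); maj and inv share one (Mahonian).
eulerian_match = distribution(des, 4) == distribution(exc, 4)
mahonian_match = distribution(maj, 4) == distribution(inv, 4)
```

The same brute-force tally applied to pairs of statistics, for example `Counter((des(p), inv(p)) for p in ...)`, would give the joint distributions that the abstract's equidistribution statements concern.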