    Significance testing without truth

    A popular approach to significance testing proposes to decide whether the given hypothesized statistical model is likely to be true (or false). Statistical decision theory provides a basis for this approach by requiring every significance test to make a decision about the truth of the hypothesis/model under consideration. Unfortunately, many interesting and useful models are obviously false (that is, not exactly true) even before considering any data. Fortunately, in practice a significance test need only gauge the consistency (or inconsistency) of the observed data with the assumed hypothesis/model -- without enquiring as to whether the assumption is likely to be true (or false), or whether some alternative is likely to be true (or false). In this practical formulation, a significance test rejects a hypothesis/model only if the observed data is highly improbable when its probability is calculated under the hypothesis being tested; the significance test only gauges whether the observed data likely invalidates the assumed hypothesis, and cannot decide that the assumption -- however unmistakably false -- is likely to be false a priori, without any data. Comment: 9 pages
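
As a rough illustration of the "consistency with the assumed hypothesis" reading described in this abstract, the sketch below computes a one-sided exact binomial p-value. The fair-coin hypothesis, the function name `binomial_tail_p_value`, and the counts are all assumptions chosen for illustration; none of them come from the paper.

```python
from math import comb


def binomial_tail_p_value(n_trials, n_successes, p_null=0.5):
    """Return P(X >= n_successes) for X ~ Binomial(n_trials, p_null).

    The probability is computed while assuming the hypothesized model
    (success probability p_null); no alternative model is involved.
    """
    return sum(
        comb(n_trials, k) * p_null**k * (1 - p_null) ** (n_trials - k)
        for k in range(n_successes, n_trials + 1)
    )


# Hypothetical data: 9 heads in 10 flips, tested against an assumed fair coin.
p_value = binomial_tail_p_value(10, 9)
print(p_value)  # 11/1024, roughly 0.0107
```

A small value here says only that the observed data would be improbable if the assumed model held; it makes no statement about how likely the model is to be true.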

    On testing the significance of sets of genes

    This paper discusses the problem of identifying differentially expressed groups of genes from a microarray experiment. The groups of genes are externally defined, for example, sets of gene pathways derived from biological databases. Our starting point is the interesting Gene Set Enrichment Analysis (GSEA) procedure of Subramanian et al. [Proc. Natl. Acad. Sci. USA 102 (2005) 15545--15550]. We study the problem in some generality and propose two potential improvements to GSEA: the maxmean statistic for summarizing gene-sets, and restandardization for more accurate inferences. We discuss a variety of examples and extensions, including the use of gene-set scores for class predictions. We also describe a new R language package GSA that implements our ideas. Comment: Published at http://dx.doi.org/10.1214/07-AOAS101 in the Annals of Applied Statistics (http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics (http://www.imstat.org)
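
The abstract names the maxmean statistic without defining it. The sketch below follows one common description of it: average the positive parts and the negative parts of the per-gene scores over the whole gene set, then keep the larger of the two averages. That definition, the function name `maxmean`, and the example scores are assumptions made here for illustration. This is Python, not the GSA package's R code, and restandardization is not shown.

```python
def maxmean(gene_scores):
    """Maxmean summary of a gene set (assumed definition, see lead-in).

    gene_scores: per-gene test statistics (e.g. z-scores) for one gene set.
    Both averages divide by the full set size, not by the number of
    positive or negative scores.
    """
    m = len(gene_scores)
    if m == 0:
        raise ValueError("gene set must contain at least one score")
    mean_positive_part = sum(max(z, 0.0) for z in gene_scores) / m
    mean_negative_part = sum(-min(z, 0.0) for z in gene_scores) / m
    return max(mean_positive_part, mean_negative_part)


# Hypothetical scores for a four-gene set.
print(maxmean([2.0, -1.0, 0.5, -0.5]))  # positive part dominates: 0.625
```

Dividing by the full set size means a few large scores of one sign are not diluted by cancellation with scores of the opposite sign, as they would be in a plain mean.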

    Testing the significance of calendar effects

    This paper studies tests of calendar effects in equity returns. It is necessary to control for all possible calendar effects to avoid spurious results. The authors contribute to the calendar effects literature and its significance with a test for calendar-specific anomalies that conditions on the nuisance of possible calendar effects. Thus, their approach to test for calendar effects produces robust data-mining results. Unfortunately, attempts to control for a large number of possible calendar effects have the downside of diminishing the power of the test, making it more difficult to detect actual anomalies. The authors show that their test achieves good power properties because it exploits the correlation structure of (excess) returns specific to the calendar effect being studied. They implement the test with bootstrap methods and apply it to stock indices from Denmark, France, Germany, Hong Kong, Italy, Japan, Norway, Sweden, the United Kingdom, and the United States. Bootstrap p-values reveal that calendar effects are significant for returns in most of these equity markets, but end-of-the-year effects are predominant. It also appears that, beginning in the late 1980s, calendar effects have diminished except in small-cap stock indices.

    Significance testing in quantile regression

    We consider the problem of testing significance of predictors in multivariate nonparametric quantile regression. A stochastic process is proposed, which is based on a comparison of the responses with a nonparametric quantile regression estimate under the null hypothesis. It is demonstrated that under the null hypothesis this process converges weakly to a centered Gaussian process, and the asymptotic properties of the test under fixed and local alternatives are also discussed. In particular, we show that, in contrast to the nonparametric approach based on estimation of $L^2$-distances, the new test is able to detect local alternatives which converge to the null hypothesis at any rate $a_n \to 0$ such that $a_n \sqrt{n} \to \infty$ (here $n$ denotes the sample size). We also present a small simulation study illustrating the finite sample properties of a bootstrap version of the corresponding Kolmogorov-Smirnov test.

    Genome-Wide Significance Levels and Weighted Hypothesis Testing

    Genetic investigations often involve the testing of vast numbers of related hypotheses simultaneously. To control the overall error rate, a substantial penalty is required, making it difficult to detect signals of moderate strength. To improve the power in this setting, a number of authors have considered using weighted $p$-values, with the motivation often based upon the scientific plausibility of the hypotheses. We review this literature, derive optimal weights and show that the power is remarkably robust to misspecification of these weights. We consider two methods for choosing weights in practice. The first, external weighting, is based on prior information. The second, estimated weighting, uses the data to choose weights. Comment: Published at http://dx.doi.org/10.1214/09-STS289 in Statistical Science (http://www.imstat.org/sts/) by the Institute of Mathematical Statistics (http://www.imstat.org)
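
To make the idea of weighted p-values concrete, here is a minimal sketch of a weighted Bonferroni rule, one standard way to use such weights: hypothesis i is rejected when p_i <= w_i * alpha / m, with non-negative weights rescaled to average 1. The abstract does not say which procedure the paper uses, so this rule, the function name `weighted_bonferroni`, and the example numbers are assumptions for illustration. The paper's optimal and estimated weights are not reproduced.

```python
def weighted_bonferroni(p_values, weights, alpha=0.05):
    """Return a list of booleans: True where the hypothesis is rejected.

    Weights must be non-negative; they are rescaled to have mean 1 so the
    per-hypothesis thresholds w_i * alpha / m still sum to alpha.
    """
    m = len(p_values)
    if len(weights) != m:
        raise ValueError("p_values and weights must have the same length")
    if any(w < 0 for w in weights):
        raise ValueError("weights must be non-negative")
    mean_weight = sum(weights) / m
    if mean_weight <= 0:
        raise ValueError("at least one weight must be positive")
    normalized = [w / mean_weight for w in weights]
    return [p <= w * alpha / m for p, w in zip(p_values, normalized)]


# Hypothetical p-values; the second hypothesis is up-weighted a priori.
p = [0.001, 0.02, 0.04, 0.5]
print(weighted_bonferroni(p, [1, 1, 1, 1]))          # unweighted Bonferroni
print(weighted_bonferroni(p, [0.5, 2.5, 0.5, 0.5]))  # weighted version
```

With equal weights the rule reduces to ordinary Bonferroni. Up-weighting a hypothesis relaxes its threshold at the cost of tightening the others.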