2,275,815 research outputs found

    Abandon Statistical Significance

    We discuss problems the null hypothesis significance testing (NHST) paradigm poses for replication and more broadly in the biomedical and social sciences, as well as how these problems remain unresolved by proposals involving modified p-value thresholds, confidence intervals, and Bayes factors. We then discuss our own proposal, which is to abandon statistical significance. We recommend dropping the NHST paradigm--and the p-value thresholds intrinsic to it--as the default statistical paradigm for research, publication, and discovery in the biomedical and social sciences. Specifically, we propose that the p-value be demoted from its threshold screening role and instead, treated continuously, be considered along with currently subordinate factors (e.g., related prior evidence, plausibility of mechanism, study design and data quality, real-world costs and benefits, novelty of finding, and other factors that vary by research domain) as just one among many pieces of evidence. We have no desire to "ban" p-values or other purely statistical measures. Rather, we believe that such measures should not be thresholded and that, thresholded or not, they should not take priority over the currently subordinate factors. We also argue that it seldom makes sense to calibrate evidence as a function of p-values or other purely statistical measures. We offer recommendations for how our proposal can be implemented in the scientific publication process as well as in statistical decision making more broadly.
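    A minimal sketch of what "treating the p-value continuously" could look like in practice (not code from the paper; the two-sample comparison, the scipy t-test, and the simulated data are assumptions for illustration): report the p-value as one number alongside the effect estimate and its interval, with no significant/non-significant verdict attached.

```python
# Illustrative sketch only: report the p-value continuously, alongside the
# effect estimate and interval, instead of thresholding it at 0.05.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
treatment = rng.normal(0.3, 1.0, size=50)   # hypothetical outcome data
control = rng.normal(0.0, 1.0, size=50)

t_stat, p_value = stats.ttest_ind(treatment, control)
effect = treatment.mean() - control.mean()
se = np.sqrt(treatment.var(ddof=1) / len(treatment) +
             control.var(ddof=1) / len(control))
ci = (effect - 1.96 * se, effect + 1.96 * se)

# No "significant / not significant" verdict: the p-value is just one number
# to weigh together with prior evidence, mechanism, design, and costs.
print(f"effect = {effect:.2f}, 95% CI = ({ci[0]:.2f}, {ci[1]:.2f}), p = {p_value:.3f}")
```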

    Assessing statistical significance of periodogram peaks

    The least-squares (or Lomb-Scargle) periodogram is a powerful tool used routinely in many branches of astronomy to search for periodicities in observational data. The problem of assessing the statistical significance of candidate periodicities for different periodograms is considered. Based on results in extreme value theory, improved analytic estimates of false alarm probabilities are given. They include an upper limit to the false alarm probability (or a lower limit to the significance). These estimates are tested numerically in order to establish the regions of their practical applicability.
    Comment: 7 pages, 6 figures, 1 table; To be published in MNRA
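    For context, analytic false-alarm estimates of this kind are available in astropy's LombScargle implementation (its 'baluev' method); a rough sketch, where the simulated time series and the bootstrap cross-check are assumptions for illustration rather than the paper's own code:

```python
# Sketch: Lomb-Scargle periodogram with an analytic (Baluev-style) false
# alarm probability for the highest peak, using astropy's implementation.
import numpy as np
from astropy.timeseries import LombScargle

rng = np.random.default_rng(1)
t = np.sort(rng.uniform(0, 100, size=200))              # uneven time sampling
y = 0.5 * np.sin(2 * np.pi * 0.17 * t) + rng.normal(0, 1, size=200)

ls = LombScargle(t, y)
frequency, power = ls.autopower()

# Analytic upper-limit FAP based on extreme-value theory (method='baluev'),
# compared with a brute-force bootstrap estimate as a sanity check.
fap_analytic = ls.false_alarm_probability(power.max(), method='baluev')
fap_bootstrap = ls.false_alarm_probability(power.max(), method='bootstrap')
print(f"peak frequency = {frequency[np.argmax(power)]:.3f}")
print(f"analytic FAP = {fap_analytic:.2e}, bootstrap FAP = {fap_bootstrap:.2e}")
```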

    Statistical significance of communities in networks

    Nodes in real-world networks are usually organized in local modules. These groups, called communities, are intuitively defined as sub-graphs with a larger density of internal connections than of external links. In this work, we introduce a new measure aimed at quantifying the statistical significance of single communities. Extreme and order statistics are used to predict the statistics associated with individual clusters in random graphs. These distributions allow us to define the significance of a community as the probability that a generic clustering algorithm finds such a group in a random graph. The method is successfully applied to real-world networks to evaluate the significance of their communities.
    Comment: 9 pages, 8 figures, 2 tables. The software to calculate the C-score can be found at http://filrad.homelinux.org/cscor
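    A Monte Carlo analogue of this idea (not the paper's analytic order-statistics C-score): score an observed community by how often a degree-preserving random graph contains an equally dense group of the same size. The example graph, the crude random-restart search standing in for "a generic clustering algorithm", and the sample counts are all assumptions.

```python
# Sketch: empirical community significance as the probability that a
# degree-preserving random graph contains an equally dense node group of the
# same size. A Monte Carlo stand-in for the analytic extreme/order-statistics
# approach described in the abstract.
import networkx as nx
import numpy as np

def densest_random_group(graph, k, n_restarts=20, rng=None):
    """Crude proxy for the densest k-node group a clustering might find:
    best internal density over randomly sampled k-node sets."""
    rng = rng if rng is not None else np.random.default_rng()
    nodes = list(graph.nodes)
    return max(nx.density(graph.subgraph(rng.choice(nodes, size=k, replace=False)))
               for _ in range(n_restarts))

def community_significance(graph, community, n_null=200, seed=0):
    k = len(community)
    observed = nx.density(graph.subgraph(community))
    degrees = [d for _, d in graph.degree()]
    rng = np.random.default_rng(seed)
    exceed = 0
    for i in range(n_null):
        # Configuration-model null: same degree sequence, otherwise random
        null = nx.Graph(nx.configuration_model(degrees, seed=seed + i))
        null.remove_edges_from(nx.selfloop_edges(null))
        if densest_random_group(null, k, rng=rng) >= observed:
            exceed += 1
    return exceed / n_null  # small value -> group unlikely to arise by chance

G = nx.karate_club_graph()
community = [n for n, d in G.nodes(data=True) if d["club"] == "Mr. Hi"]
print("empirical significance:", community_significance(G, community))
```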

    Statistical Significance of the Netflix Challenge

    Inspired by the legacy of the Netflix contest, we provide an overview of what has been learned, from our own efforts and those of others, concerning the problems of collaborative filtering and recommender systems. The data set consists of about 100 million movie ratings (from 1 to 5 stars) involving some 480 thousand users and some 18 thousand movies; the associated ratings matrix is about 99% sparse. The goal is to predict the ratings that users will give to movies; systems that can do this accurately have significant commercial applications, particularly on the world wide web. We discuss, in some detail, approaches to "baseline" modeling, singular value decomposition (SVD), as well as kNN (nearest neighbor) and neural network models; temporal effects, cross-validation issues, ensemble methods, and other considerations are discussed as well. We compare existing models in a search for new models, and also discuss the mission-critical issues of penalization and parameter shrinkage, which arise when the dimension of the parameter space reaches into the millions. Although much work on such problems has been carried out by the computer science and machine learning communities, our goal here is to address a statistical audience and to provide a primarily statistical treatment of the lessons that have been learned from this remarkable set of data.
    Comment: Published in Statistical Science (http://www.imstat.org/sts/) by the Institute of Mathematical Statistics (http://www.imstat.org) at http://dx.doi.org/10.1214/11-STS368
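    A compact sketch of the "baseline" modeling and shrinkage ideas mentioned above: a global mean plus regularized per-user and per-item offsets, where each offset is shrunk toward zero. The toy ratings array and the shrinkage strengths are illustrative assumptions, not the estimator used in the paper.

```python
# Sketch: regularized "baseline" predictor for a sparse ratings matrix,
#   r_hat[u, i] = mu + b_u[u] + b_i[i],
# where user/item offsets are shrunk toward zero (penalization) -- the kind of
# parameter shrinkage that matters when parameters number in the millions.
import numpy as np

# (user, item, rating) triples standing in for the ~100M Netflix entries
ratings = np.array([[0, 0, 5], [0, 1, 3], [1, 0, 4], [2, 1, 1], [2, 2, 2]])
n_users, n_items = 3, 3
lam_item, lam_user = 10.0, 15.0          # shrinkage strengths (assumed)

mu = ratings[:, 2].mean()                # global mean rating

# Item offsets: shrunken mean residual per item
b_i = np.zeros(n_items)
for i in range(n_items):
    resid = ratings[ratings[:, 1] == i, 2] - mu
    b_i[i] = resid.sum() / (lam_item + len(resid))

# User offsets: shrunken mean residual per user, after removing item offsets
b_u = np.zeros(n_users)
for u in range(n_users):
    rows = ratings[ratings[:, 0] == u]
    resid = rows[:, 2] - mu - b_i[rows[:, 1]]
    b_u[u] = resid.sum() / (lam_user + len(resid))

def predict(u, i):
    return mu + b_u[u] + b_i[i]

print("baseline prediction for user 1, item 2:", round(predict(1, 2), 2))
```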

    The Cult of Statistical Significance

    This article takes issue with a recent book by Ziliak and McCloskey (2008) of the same title. Ziliak and McCloskey argue that statistical significance testing is a barrier rather than a booster for empirical research in many fields and should therefore be abandoned altogether. The present article argues that this is good advice in some research areas but not in others. Taking as examples all issues of the German Economic Review that have appeared so far and a recent epidemiological meta-analysis, it shows that there has indeed been a lot of misleading work in the context of significance testing, and that at the same time many promising avenues for fruitfully employing statistical significance tests, disregarded by Ziliak and McCloskey, have not been used.

    Statistical significance of variables driving systematic variation

    There are a number of well-established methods, such as principal components analysis (PCA), for automatically capturing systematic variation due to latent variables in large-scale genomic data. PCA and related methods may directly provide a quantitative characterization of a complex biological variable that is otherwise difficult to precisely define or model. An unsolved problem in this context is how to systematically identify the genomic variables that drive the systematic variation captured by PCA. Principal components (and other estimates of systematic variation) are constructed directly from the genomic variables themselves, so conventional methods yield artificially inflated measures of statistical significance due to over-fitting. We introduce a new approach called the jackstraw that allows one to accurately identify genomic variables that are statistically significantly associated with any subset or linear combination of principal components (PCs). The proposed method can greatly simplify complex significance testing problems encountered in genomics and can be utilized to identify the genomic variables significantly associated with latent variables. Using simulation, we demonstrate that our method attains accurate measures of statistical significance over a range of relevant scenarios. We consider yeast cell-cycle gene expression data and show that the proposed method can be used to straightforwardly identify statistically significant genes that are cell-cycle regulated. We also analyze gene expression data from post-trauma patients, allowing the gene expression data to provide a molecularly driven phenotype; we find a greater enrichment for inflammation-related gene sets than when using a clinically defined phenotype. The proposed method provides a useful bridge between large-scale quantifications of systematic variation and gene-level significance analyses.
    Comment: 35 pages, 1 table, 6 main figures, 7 supplementary figure
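    A rough numpy sketch of the jackstraw idea as described above: permute a small number of rows (variables), re-estimate the top principal component, and use the permuted rows' association statistics as an empirical null. The squared-correlation statistic, the number of permuted rows s, and the iteration counts are assumptions chosen for illustration, not the reference implementation.

```python
# Sketch of the jackstraw idea: build an empirical null for "association with
# the top principal component" by permuting a few rows (variables) at a time,
# re-estimating the PC, and recording the permuted rows' association scores.
import numpy as np

def jackstraw_pvalues(X, s=10, n_iter=100, seed=0):
    """X: variables x samples. Per-variable empirical p-values for association
    with the first principal component (illustrative sketch only)."""
    rng = np.random.default_rng(seed)
    m, n = X.shape

    def pc1_scores(data):
        centered = data - data.mean(axis=1, keepdims=True)
        _, _, vt = np.linalg.svd(centered, full_matrices=False)
        pc1 = vt[0]                                  # PC1 pattern across samples
        num = centered @ pc1                         # per-variable association
        denom = np.linalg.norm(centered, axis=1) * np.linalg.norm(pc1)
        return (num / denom) ** 2                    # squared correlation with PC1

    observed = pc1_scores(X)

    null_stats = []
    for _ in range(n_iter):
        Xp = X.copy()
        rows = rng.choice(m, size=s, replace=False)
        for r in rows:                               # permute s rows independently
            Xp[r] = rng.permutation(Xp[r])
        null_stats.append(pc1_scores(Xp)[rows])      # permuted rows give the null
    null_stats = np.concatenate(null_stats)

    # Empirical p-value: fraction of null statistics at least as large
    return (1 + (null_stats[None, :] >= observed[:, None]).sum(axis=1)) / (1 + null_stats.size)

# Toy example: 200 variables, 50 samples; the first 40 variables share a latent factor
rng = np.random.default_rng(1)
latent = rng.normal(size=50)
X = rng.normal(size=(200, 50))
X[:40] += 2.0 * latent
pvals = jackstraw_pvalues(X)
print("median p (driven vs. null variables):", np.median(pvals[:40]), np.median(pvals[40:]))
```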