6,894 research outputs found

    Significance Tests Harm Progress in Forecasting

    Get PDF
    Based on a summary of prior literature, I conclude that tests of statistical significance harm scientific progress. Efforts to find exceptions to this conclusion have, to date, turned up none. Even when done correctly, significance tests are dangerous. I show that summaries of scientific research do not require tests of statistical significance. I illustrate the dangers of significance tests by examining a reanalysis of the M3-Competition. Although the authors of that reanalysis conducted a proper series of statistical tests, they suggest that the original M3-Competition was not justified in concluding that combined forecasts reduce errors and that the selection of the best method depends on the choice of error measure. I show that the original conclusions were justified and correct. Authors should avoid tests of statistical significance, journals should discourage them, and readers should ignore them. Instead, to analyze and communicate findings from empirical studies, one should use effect sizes, confidence intervals, replications/extensions, and meta-analyses.
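
    As a concrete illustration of the closing recommendation (a minimal sketch, not from the paper; the data, sample sizes, and variable names below are invented), a forecast-error comparison might be summarized with an effect size and a confidence interval instead of a significance test:

    # Minimal sketch: report an effect size and a confidence interval for a
    # hypothetical paired comparison of forecast errors. All numbers invented.
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(0)
    errors_single = rng.normal(loc=12.0, scale=3.0, size=50)    # errors of one method
    errors_combined = rng.normal(loc=10.5, scale=3.0, size=50)  # errors of a combined forecast

    diff = errors_single - errors_combined  # paired reduction in error
    mean_diff = diff.mean()

    # Standardized effect size (Cohen's d for paired data)
    cohens_d = mean_diff / diff.std(ddof=1)

    # 95% confidence interval for the mean reduction, via the t distribution
    ci_low, ci_high = stats.t.interval(
        0.95, df=len(diff) - 1, loc=mean_diff, scale=stats.sem(diff)
    )
    print(f"mean error reduction = {mean_diff:.2f}, d = {cohens_d:.2f}, "
          f"95% CI = [{ci_low:.2f}, {ci_high:.2f}]")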

    Trustworthiness of statistical inference

    Get PDF
    We examine the role of trustworthiness and trust in statistical inference, arguing that it is the extent of trustworthiness in inferential statistical tools which enables trust in the conclusions. Certain tools, such as the p‐value and significance test, have recently come under renewed criticism, with some arguing that they damage trust in statistics. We argue the contrary, beginning from the position that the central role of these methods is to form the basis for trusted conclusions in the face of uncertainty in the data, and noting that it is the misuse and misunderstanding of these tools which damages trustworthiness and hence trust. We go on to argue that recent calls to ban these tools would tackle the symptom, not the cause, and themselves risk damaging the capability of science to advance, as well as risking feeding into public suspicion of the discipline of statistics. The consequence could be aggravated mistrust of our discipline and of science more generally. In short, the very proposals could work in quite the contrary direction from that intended. We make some alternative proposals for tackling the misuse and misunderstanding of these methods, and for how trust in our discipline might be promoted.

    Researchers Should Make Thoughtful Assessments Instead of Null-Hypothesis Significance Tests

    Get PDF
    Null-hypothesis significance tests (NHSTs) have received much criticism, especially during the last two decades. Yet many behavioral and social scientists are unaware that NHSTs have drawn increasing criticism, so this essay summarizes key criticisms. The essay also recommends alternative ways of assessing research findings. Although these recommendations are not complex, they do involve ways of thinking that many behavioral and social scientists find novel. Instead of conducting NHSTs, researchers should adapt their research assessments to specific contexts and specific research goals, and then explain their rationales for selecting assessment indicators. Researchers should show the substantive importance of findings by reporting effect sizes and should acknowledge uncertainty by stating confidence intervals. By comparing data with naïve hypotheses rather than with null hypotheses, researchers can challenge themselves to develop better theories. Parsimonious models are easier to understand and generalize more reliably. Robust statistical methods tolerate deviations from assumptions about samples.
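
    Two of these recommendations, robust estimation and an explicit uncertainty statement, can be combined in one small example (my illustration, not the essay's; the data and constants are invented): a trimmed mean with a bootstrap confidence interval tolerates outliers that would distort an ordinary mean.

    # Minimal sketch: robust estimate (20% trimmed mean) with a bootstrap
    # confidence interval. Data and constants are invented for illustration.
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(1)
    sample = rng.normal(loc=5.0, scale=1.0, size=40)
    sample[:3] = [25.0, 30.0, -20.0]  # a few gross outliers

    trimmed = stats.trim_mean(sample, proportiontocut=0.2)
    res = stats.bootstrap(
        (sample,),
        lambda x, axis: stats.trim_mean(x, 0.2, axis=axis),
        confidence_level=0.95,
        random_state=rng,
    )
    print(f"ordinary mean = {sample.mean():.2f} (distorted by the outliers)")
    print(f"20% trimmed mean = {trimmed:.2f}, 95% bootstrap CI = "
          f"[{res.confidence_interval.low:.2f}, {res.confidence_interval.high:.2f}]")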

    Testing Point Null Hypothesis of a Normal Mean and the Truth: 21st Century Perspective

    Get PDF
    Testing a point (sharp) null hypothesis is arguably the most widely used statistical inferential procedure in many fields of scientific research; it is nevertheless also the most controversial and misapprehended. Since 1935, when Buchanan-Wollaston raised the first criticism against hypothesis testing, this foundational field of statistics has drawn increasingly active and stronger opposition, including draconian suggestions that statistical significance testing should be abandoned or even banned. Statisticians should stop ignoring these accumulated and significant anomalies within the current point-null hypothesis paradigm and rebuild healthy foundations of statistical science. A foundation for a paradigm shift in testing statistical hypotheses is suggested: testing interval null hypotheses, based on implications of the Zero probability paradox. The paradox states that, in real-world research, a point-null hypothesis of a normal mean has zero probability. This implies that a point-null hypothesis of a mean formulated in the context of the simple normal model is almost surely false. Thus, the Zero probability paradox points to the root cause of the so-called large-n problem in significance testing, and it discloses that there is no point in searching for a cure under the current point-null paradigm.
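
    The large-n problem the abstract invokes is easy to exhibit numerically. The sketch below (my own illustration, not the paper's method; all numbers are invented) shows a practically negligible deviation from a point null becoming "significant" as n grows, while a crude interval-null check (is the confidence interval contained in a band of practically equivalent means?) is unaffected:

    # Minimal sketch of the large-n problem: the true mean is off the point
    # null by a trivial 0.05, yet the p-value collapses as n grows.
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(2)
    mu_true, mu_null, delta = 100.05, 100.0, 0.5  # delta defines the interval null
    for n in (100, 10_000, 1_000_000):
        x = rng.normal(loc=mu_true, scale=1.0, size=n)
        t_stat, p = stats.ttest_1samp(x, popmean=mu_null)
        # Crude interval-null check: is the 95% CI for the mean contained in
        # [mu_null - delta, mu_null + delta]?
        lo, hi = stats.t.interval(0.95, df=n - 1, loc=x.mean(), scale=stats.sem(x))
        inside = (mu_null - delta < lo) and (hi < mu_null + delta)
        print(f"n={n:>9}: point-null p = {p:.2e}, CI inside interval null: {inside}")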

    Effect size, confidence intervals and statistical power in psychological research

    Full text link