To p, or not to p?: quantifying inferential decision errors to assess whether significance truly is significant

Abstract

Empirical testing is centred on p-values. These summary statistics are used to assess the plausibility of a null hypothesis, and therein lies a flaw in their interpretation. Central to this research is accounting for the behaviour of p-values, through density functions, under the alternative hypothesis, H1. These densities are determined by a combination of the sample size and the parametric specification of H1. Here, several new contributions are presented to reflect p-value behaviour. By considering the likelihood of both hypotheses in parallel, it is possible to optimise the decision-making process. A framework for simultaneously testing the null and alternative hypotheses is outlined for various testing scenarios. To facilitate efficient empirical conclusions, a new set of critical value tables is presented that requires only the conventional p-value, so this joint testing can be applied in practice without additional computation. Simple and composite forms of H1 are considered. Recognising the conflict between different schools of thought with respect to hypothesis testing, a unified approach that consolidates the advantages of each is offered. Again exploiting p-value distributions under various forms of H1, a revised conditioning statistic for conditional frequentist testing is developed, from which original p-value curves and surfaces are produced to further ease decision making. Finally, attention turns to multiple hypothesis testing. Estimation of multiple testing error rates is discussed, and a new estimator for the proportion of true null hypotheses, when simultaneously testing several independent hypotheses, is presented. Under certain conditions it is shown that this estimator is superior to an established estimator.
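To illustrate how a p-value density under H1 depends on the sample size and the parametric specification of H1, the following is a minimal sketch for one textbook case only: a one-sided z-test with known unit variance and a simple alternative. This setting, and the names `p_value_density`, `p_value_cdf`, `effect_size` and `n`, are assumptions made for illustration and are not taken from the thesis. Under these assumptions the test statistic is normal with mean `effect_size * sqrt(n)`, so the p-value density is the ratio of two normal densities evaluated at the quantile implied by p.

```python
from statistics import NormalDist

# Standard normal distribution used for quantiles, densities and tail areas.
_STD_NORMAL = NormalDist()


def p_value_density(p, effect_size, n):
    """Density of the one-sided z-test p-value at p under a simple H1.

    Assumes (for illustration) Z ~ N(effect_size * sqrt(n), 1) under H1
    and p = 1 - Phi(Z). With effect_size = 0 this is the uniform density.
    """
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    z = _STD_NORMAL.inv_cdf(1.0 - p)   # quantile that yields p-value p
    shift = effect_size * n ** 0.5     # mean of Z under H1
    return _STD_NORMAL.pdf(z - shift) / _STD_NORMAL.pdf(z)


def p_value_cdf(p, effect_size, n):
    """P(p-value <= p) under the same H1; at p = alpha this is the power."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    z = _STD_NORMAL.inv_cdf(1.0 - p)
    shift = effect_size * n ** 0.5
    return 1.0 - _STD_NORMAL.cdf(z - shift)


if __name__ == "__main__":
    # Compare the density at p = 0.05 for two sample sizes, same effect size.
    for n in (10, 50):
        print(n, p_value_density(0.05, 0.5, n), p_value_cdf(0.05, 0.5, n))
```

In this sketch, increasing either `effect_size` or `n` concentrates the density near zero, while `effect_size = 0` recovers the uniform distribution that holds under the null hypothesis.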

This paper was published in LSE Theses Online.
