17 research outputs found
Tests of fit for the logarithmic distribution
Smooth tests for the logarithmic distribution are compared with three tests: the first is a test due to Epps and is based on a probability generating function, the second is the Anderson-Darling test, and the third is due to Klar and is based on the empirical integrated distribution function. These tests all have substantially better power than the traditional Pearson-Fisher X² test of fit for the logarithmic distribution. These traditional chi-squared tests are the only logarithmic tests of fit commonly applied by ecologists and other scientists.
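For orientation, the traditional chi-squared test mentioned in this abstract can be sketched as follows. This is a minimal illustration and not the authors' procedure: the function names (log_pmf, fit_theta, pearson_x2) and the counts are hypothetical, and the parameter is fitted by matching the sample mean (the raw-data maximum likelihood estimate for this distribution), whereas the Pearson-Fisher version estimates it from the grouped cells. Degrees of freedom and p-values are deliberately left out.

```python
import math

def log_pmf(k, theta):
    """P(K = k) for the logarithmic distribution, k = 1, 2, ..., 0 < theta < 1."""
    return -theta ** k / (k * math.log1p(-theta))

def fit_theta(mean, tol=1e-12):
    """Solve mean = -theta / ((1 - theta) * ln(1 - theta)) for theta by bisection.

    The right-hand side increases from 1 to infinity on (0, 1).
    """
    if mean <= 1.0:
        raise ValueError("sample mean must exceed 1")
    lo, hi = 1e-12, 1.0 - 1e-12
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        m = -mid / ((1.0 - mid) * math.log1p(-mid))
        if m < mean:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

def pearson_x2(counts, theta):
    """Pearson X^2 where counts[k-1] is the frequency of value k.

    The last cell is treated as "len(counts) or more" (pooled tail).
    """
    n = sum(counts)
    c = len(counts)
    probs = [log_pmf(k, theta) for k in range(1, c)]
    probs.append(1.0 - sum(probs))  # tail probability P(K >= c)
    return sum((o - n * p) ** 2 / (n * p) for o, p in zip(counts, probs))

# Made-up frequencies of the values 1..5 (no observation exceeds 5).
counts = [710, 180, 60, 30, 20]
sample_mean = sum(k * o for k, o in enumerate(counts, start=1)) / sum(counts)
theta_hat = fit_theta(sample_mean)
print(theta_hat, pearson_x2(counts, theta_hat))
```

Pooling the tail into the last cell keeps the cell probabilities summing to one; how many cells to use is a judgment call that the abstract does not address.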
Sequence count data are poorly fit by the negative binomial distribution
Sequence count data are commonly modelled using the negative binomial (NB) distribution. Several empirical studies, however, have demonstrated that methods based on the NB-assumption do not always succeed in controlling the false discovery rate (FDR) at its nominal level. In this paper, we propose a dedicated statistical goodness-of-fit test for the NB distribution in regression models and demonstrate that the NB-assumption is violated in many publicly available RNA-Seq and 16S rRNA microbiome datasets. The zero-inflated NB distribution was not found to give a substantially better fit. We also show that the NB-based tests perform worse on the features for which the NB-assumption was violated than on the features for which no significant deviation was detected. This gives an explanation for the poor behaviour of NB-based tests in many published evaluation studies. We conclude that non-parametric tests should be preferred over parametric methods.
More informative testing for bivariate symmetry
In testing for bivariate symmetry against arbitrary alternatives, the well-known test developed by Bowker in 1948 is shown to be a score test and to have useful components. These components are asymptotically independent and asymptotically have the standard normal distribution. Moreover, they assess particular pairs of cells for symmetry. These components can also be used in a data-analytic manner to complement a test for bivariate symmetry against ordered alternatives.
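As a rough illustration of the idea in this abstract, the sketch below computes Bowker's statistic for a square table as the sum over cell pairs (i, j), i < j, of (n_ij - n_ji)² / (n_ij + n_ji), together with the signed square roots of its terms. Treating those signed roots as the "components" is an assumption made here for illustration; the abstract does not give their formula. The function name and the table are hypothetical.

```python
import math

def bowker_components(table):
    """Return Bowker's statistic and one signed term per off-diagonal cell pair.

    table is a square list of lists of counts. Pairs with n_ij + n_ji == 0
    carry no information and are skipped.
    """
    k = len(table)
    if any(len(row) != k for row in table):
        raise ValueError("table must be square")
    components = {}
    for i in range(k):
        for j in range(i + 1, k):
            total = table[i][j] + table[j][i]
            if total == 0:
                continue
            # Signed root: compared informally with a standard normal.
            components[(i, j)] = (table[i][j] - table[j][i]) / math.sqrt(total)
    statistic = sum(z * z for z in components.values())
    return statistic, components

# Made-up 3 x 3 table of paired classifications.
table = [
    [20, 10, 5],
    [3, 30, 15],
    [0, 5, 40],
]
stat, comps = bowker_components(table)
print(stat)
for pair, z in comps.items():
    print(pair, round(z, 3))
```

The statistic is conventionally referred to a chi-squared distribution with k(k - 1)/2 degrees of freedom; a large component for a given pair points to where the asymmetry lies.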
A contingency table approach to nonparametric testing
Most texts on nonparametric techniques concentrate on location and linear-linear (correlation) tests, with less emphasis on dispersion effects and linear-quadratic tests. Tests for higher moment effects are virtually ignored. Using a fresh approach, A Contingency Table Approach to Nonparametric Testing unifies and extends the popular, standard tests by linking them to tests based on models for data that can be presented in contingency tables. This approach unifies popular nonparametric statistical inference and makes the traditional, most commonly performed nonparametric analyses much more com
Tests for symmetry based on the one-sample Wilcoxon signed rank statistic
The one-sample Wilcoxon signed rank test was originally designed to test for a specified median, under the assumption that the distribution is symmetric, but it can also serve as a test for symmetry if the median is known. In this article we derive the Wilcoxon statistic as the first component of Pearson's X² statistic for independence in a particularly constructed contingency table. The second and third components are new test statistics for symmetry. In the second part of the article, the Wilcoxon test is extended so that symmetry around the median and symmetry in the tails can be examined separately. A trimming proportion is used to split the observations in the tails from those around the median. We further extend the method so that no arbitrary choice for the trimming proportion has to be made. Finally, the new tests are compared to other tests for symmetry in a simulation study. It is concluded that our tests often have substantially greater power than most other tests.
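The starting point of this abstract, the signed rank statistic used as a symmetry test about a known median, can be sketched as below. This covers only the classical statistic with its large-sample normal approximation; it does not reproduce the contingency-table derivation, the higher-order components, or the trimmed extensions described in the article. The function name and the data are hypothetical, and no tie correction is applied to the variance.

```python
import math

def signed_rank_z(xs, median):
    """Return (W+, z) for the one-sample Wilcoxon signed rank statistic.

    W+ is the sum of the ranks of |x - median| over observations above the
    median. Observations equal to the median are dropped; tied absolute
    differences receive average ranks.
    """
    diffs = [x - median for x in xs if x != median]
    n = len(diffs)
    if n == 0:
        raise ValueError("no observations differ from the median")
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    pos = 0
    while pos < n:
        end = pos
        while end + 1 < n and abs(diffs[order[end + 1]]) == abs(diffs[order[pos]]):
            end += 1
        avg_rank = (pos + end) / 2.0 + 1.0  # average of ranks pos+1 .. end+1
        for t in range(pos, end + 1):
            ranks[order[t]] = avg_rank
        pos = end + 1
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    mean = n * (n + 1) / 4.0
    var = n * (n + 1) * (2 * n + 1) / 24.0  # null variance without tie correction
    return w_plus, (w_plus - mean) / math.sqrt(var)

# Made-up sample with an assumed known median of 3.0.
w_plus, z = signed_rank_z([1.0, 2.5, 4.0, 7.0, 11.0], median=3.0)
print(w_plus, z)
```

Under symmetry about the stated median, z is approximately standard normal for moderate to large samples; with only five observations, as in this toy example, exact tables would be the safer choice.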
Dimensionality reduction methods for contingency tables with ordinal variables
Several extensions of correspondence analysis have been introduced in the literature to cope with the possible ordinal structure of the variables. They usually obtain a graphical representation of the interdependence between the rows and columns of a contingency table by using several tools for the dimensionality reduction of the spaces involved. These tools enrich the interpretation of the graphical planes and provide additional information beyond the usual singular value decomposition. The main aim of this paper is to suggest a unified theoretical framework for several methods of correspondence analysis that cope with ordinal variables.