518 research outputs found
Exact Nonparametric Two-Sample Homogeneity Tests for Possibly Discrete Distributions
In this paper, we study several tests for the equality of two unknown distributions. Two are based on empirical distribution functions, three others on nonparametric probability density estimates, and the last three on differences between sample moments. We suggest controlling the size of such tests (under nonparametric assumptions) by using permutational versions of the tests jointly with the method of Monte Carlo tests, properly adjusted to deal with discrete distributions. We also propose a combined test procedure whose level is again perfectly controlled through the Monte Carlo test technique and which has better power properties than the individual tests that are combined. Finally, in a simulation experiment, we show that the suggested technique provides perfect control of test size and that the new tests proposed can yield sizeable power improvements.
Keywords: nonparametric methods, two-sample problem, discrete distribution, discontinuous distribution, goodness-of-fit test, Kolmogorov-Smirnov test, Cramér-von Mises, kernel density estimator, exact test, permutation test, Monte Carlo test, bootstrap, combined test procedure, induced test
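The idea of a permutational Monte Carlo test adjusted for discrete data can be sketched as follows. This is a minimal illustration, not the authors' procedure: it assumes the two-sample Kolmogorov-Smirnov statistic as the test statistic and breaks ties between equal statistic values by attaching an independent uniform draw to each one. The function names, the default of 99 replications, and the seed are assumptions made for the example.

```python
import random
from bisect import bisect_right


def ks_statistic(x, y):
    """Two-sample KS statistic: largest gap between the two empirical CDFs."""
    xs, ys = sorted(x), sorted(y)
    return max(
        abs(bisect_right(xs, t) / len(xs) - bisect_right(ys, t) / len(ys))
        for t in set(xs) | set(ys)
    )


def mc_permutation_pvalue(x, y, n_rep=99, seed=0):
    """Monte Carlo permutation p-value with randomized tie-breaking.

    Each statistic is paired with a uniform draw, and pairs are compared
    lexicographically, so ties (common with discrete data) are ordered at random.
    """
    rng = random.Random(seed)
    pooled = list(x) + list(y)
    observed = (ks_statistic(x, y), rng.random())
    at_least_as_large = 0
    for _ in range(n_rep):
        rng.shuffle(pooled)  # random reassignment of observations to samples
        simulated = (
            ks_statistic(pooled[: len(x)], pooled[len(x):]),
            rng.random(),
        )
        if simulated >= observed:  # tuple comparison is lexicographic
            at_least_as_large += 1
    return (at_least_as_large + 1) / (n_rep + 1)
```

With `n_rep = 99`, the p-value takes values on the grid 0.01, 0.02, ..., 1.00, so rejecting when it is at most 0.05 gives a test at the 5% level under exchangeability of the pooled sample.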
Goodness-of-fit test for discrete and censored data, based on the empirical distribution function
In this thesis, two general problems concerning goodness-of-fit statistics based on the empirical distribution function are considered. The first concerns the problem of adapting Kolmogorov-Smirnov type statistics to test for discrete populations. The significance points of the statistics are given and various power comparisons are made.
The second problem concerns testing for goodness-of-fit with censored data using Cramér-von Mises type statistics. The small and large sample distributions are given, and the tests are modified so that they can be used to test for the normal and the exponential distributions. The asymptotic theory is developed. Percentage points for the statistics are given, and various small sample and large sample power studies are made for the various cases.
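To make the first problem concrete, here is a hedged sketch of a Kolmogorov-Smirnov type statistic for a fully specified discrete null distribution, with a significance point approximated by simulation. The thesis tabulates significance points; the simulation approach, the function names, and the defaults below are assumptions made for illustration only. Observations outside the stated support are ignored by this sketch.

```python
import math
import random
from itertools import accumulate


def discrete_ks(sample, support, probs):
    """Max |empirical CDF - null CDF| evaluated at each point of the ordered support."""
    n = len(sample)
    null_cdf = accumulate(probs)
    cumulative_count = 0
    d = 0.0
    for value, f0 in zip(support, null_cdf):
        cumulative_count += sample.count(value)
        d = max(d, abs(cumulative_count / n - f0))
    return d


def simulated_critical_value(support, probs, n, alpha=0.05, n_rep=2000, seed=0):
    """Approximate (1 - alpha) quantile of the statistic under the null."""
    rng = random.Random(seed)
    stats = sorted(
        discrete_ks(rng.choices(support, weights=probs, k=n), support, probs)
        for _ in range(n_rep)
    )
    index = min(n_rep - 1, math.ceil((1 - alpha) * n_rep) - 1)
    return stats[index]
```

For example, `simulated_critical_value([0, 1, 2], [0.2, 0.5, 0.3], n=30)` returns a threshold above which the observed statistic would be judged significant at roughly the 5% level.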
Maximum Fidelity
The most fundamental problem in statistics is the inference of an unknown
probability distribution from a finite number of samples. For a specific
observed data set, answers to the following questions would be desirable: (1)
Estimation: Which candidate distribution provides the best fit to the observed
data? (2) Goodness-of-fit: How concordant is this distribution with the
observed data? (3) Uncertainty: How concordant are other candidate
distributions with the observed data? A simple unified approach for univariate
data that addresses these traditionally distinct statistical notions, called
"maximum fidelity", is presented. Maximum fidelity is a strict frequentist
approach that is fundamentally based on model concordance with the observed
data. The fidelity statistic is a general information measure based on the
coordinate-independent cumulative distribution and critical yet previously
neglected symmetry considerations. An approximation for the null distribution
of the fidelity allows its direct conversion to absolute model concordance (p
value). Fidelity maximization allows identification of the most concordant
model distribution, generating a method for parameter estimation, with
neighboring, less concordant distributions providing the "uncertainty" in this
estimate. Maximum fidelity provides an optimal approach for parameter
estimation (superior to maximum likelihood) and a generally optimal approach
for goodness-of-fit assessment of arbitrary models applied to univariate data.
Extensions to binary data, binned data, multidimensional data, and classical
parametric and nonparametric statistical tests are described. Maximum fidelity
provides a philosophically consistent, robust, and seemingly optimal foundation
for statistical inference. All findings are presented in an elementary way to
be immediately accessible to all researchers utilizing statistical analysis.
Comment: 66 pages, 32 figures, 7 tables, submitte