518 research outputs found

    Exact Nonparametric Two-Sample Homogeneity Tests for Possibly Discrete Distributions

    In this paper, we study several tests for the equality of two unknown distributions. Two are based on empirical distribution functions, three others on nonparametric probability density estimates, and the last three on differences between sample moments. We suggest controlling the size of such tests (under nonparametric assumptions) by using permutational versions of the tests jointly with the method of Monte Carlo tests, properly adjusted to deal with discrete distributions. We also propose a combined test procedure, whose level is again perfectly controlled through the Monte Carlo test technique and which has better power properties than the individual tests that are combined. Finally, in a simulation experiment, we show that the technique suggested provides perfect control of test size and that the new tests proposed can yield sizeable power improvements.
    Keywords: nonparametric methods, two-sample problem, discrete distribution, discontinuous distribution, goodness-of-fit test, Kolmogorov-Smirnov test, Cramér-von Mises, kernel density estimator, exact test, permutation test, Monte Carlo test, bootstrap, combined test procedure, induced test
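
As a rough illustration of the idea in the abstract above, the following sketch combines a permutation version of the two-sample Kolmogorov-Smirnov statistic with a Monte Carlo p-value whose ties are broken by auxiliary uniform draws, so that the procedure remains usable when the data (and hence the statistic) are discrete. It is a minimal sketch, not the authors' implementation: the function names `ks_statistic` and `mc_permutation_pvalue`, the choice of 99 replications, and the specific tie-breaking rule are assumptions made here for illustration.

```python
import numpy as np


def ks_statistic(x, y):
    """Two-sample KS statistic: largest gap between the two empirical CDFs."""
    x = np.sort(np.asarray(x))
    y = np.sort(np.asarray(y))
    pooled = np.concatenate([x, y])
    # Empirical CDFs evaluated at every pooled observation (right-continuous).
    fx = np.searchsorted(x, pooled, side="right") / len(x)
    fy = np.searchsorted(y, pooled, side="right") / len(y)
    return float(np.max(np.abs(fx - fy)))


def mc_permutation_pvalue(x, y, n_rep=99, seed=0):
    """Monte Carlo permutation p-value with randomized tie-breaking.

    Assumption: ties between the observed and simulated statistics are
    ordered by independent uniform draws attached to each statistic.
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x)
    y = np.asarray(y)
    pooled = np.concatenate([x, y])
    s0 = ks_statistic(x, y)

    sims = np.empty(n_rep)
    for i in range(n_rep):
        perm = rng.permutation(pooled)  # relabel observations at random
        sims[i] = ks_statistic(perm[: len(x)], perm[len(x):])

    # One uniform for the observed statistic, one per simulated statistic.
    u0 = rng.uniform()
    u = rng.uniform(size=n_rep)

    tied = np.isclose(sims, s0)
    n_greater = np.sum((sims > s0) & ~tied)
    n_tied_above = np.sum(tied & (u >= u0))
    return (n_greater + n_tied_above + 1) / (n_rep + 1)


if __name__ == "__main__":
    rng = np.random.default_rng(42)
    a = rng.poisson(2.0, size=30)  # discrete samples, so ties are common
    b = rng.poisson(3.5, size=30)
    print(mc_permutation_pvalue(a, b))
```

With `n_rep = 99`, the p-value takes values on the grid 1/100, 2/100, ..., 1, so rejecting when it is at most 0.05 targets a 5% level. Without the tie-breaking step, counting every tie against the null would keep the test valid but could make it conservative.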

    Goodness-of-fit test for discrete and censored data, based on the empirical distribution function

    In this thesis, two general problems concerning goodness-of-fit statistics based on the empirical distribution function are considered. The first concerns adapting Kolmogorov-Smirnov type statistics to test for discrete populations. The significance points of the statistics are given and various power comparisons are made. The second concerns testing for goodness-of-fit with censored data using Cramér-von Mises type statistics. The small- and large-sample distributions are given, and the tests are modified so that they can be used to test for the normal and the exponential distributions. The asymptotic theory is developed. Percentage points for the statistics are given, and small-sample and large-sample power studies are made for the various cases.
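
To make the first problem concrete, here is a minimal sketch of a Kolmogorov-Smirnov type statistic for a fully specified discrete null distribution. The thesis reports significance points for such statistics; the sketch below does not reproduce them and instead approximates a p-value by simulating from the null, which is a stand-in technique chosen here. The names `discrete_ks_statistic` and `simulated_pvalue`, and the default of 999 replications, are hypothetical.

```python
import numpy as np


def discrete_ks_statistic(sample, support, pmf):
    """Largest gap between the empirical CDF and the null CDF on the support."""
    sample = np.asarray(sample)
    support = np.asarray(support)
    null_cdf = np.cumsum(pmf)
    counts = np.array([np.sum(sample == v) for v in support])
    ecdf = np.cumsum(counts) / len(sample)
    return float(np.max(np.abs(ecdf - null_cdf)))


def simulated_pvalue(sample, support, pmf, n_rep=999, seed=0):
    """Approximate p-value by drawing samples of the same size from the null."""
    rng = np.random.default_rng(seed)
    sample = np.asarray(sample)
    d0 = discrete_ks_statistic(sample, support, pmf)
    sims = np.array([
        discrete_ks_statistic(
            rng.choice(support, size=len(sample), p=pmf), support, pmf
        )
        for _ in range(n_rep)
    ])
    # Ties are counted against the null, which keeps the test conservative.
    return (np.sum(sims >= d0 - 1e-12) + 1) / (n_rep + 1)


if __name__ == "__main__":
    support = [0, 1, 2, 3]
    pmf = [0.25, 0.25, 0.25, 0.25]  # hypothesized uniform distribution
    data = [0, 0, 1, 0, 2, 0, 1, 0, 3, 0, 0, 1]
    print(discrete_ks_statistic(data, support, pmf))
    print(simulated_pvalue(data, support, pmf))
```

Simulating the null distribution sidesteps the fact that the usual continuous-case Kolmogorov-Smirnov tables no longer apply when the population is discrete.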

    Maximum Fidelity

    The most fundamental problem in statistics is the inference of an unknown probability distribution from a finite number of samples. For a specific observed data set, answers to the following questions would be desirable: (1) Estimation: which candidate distribution provides the best fit to the observed data? (2) Goodness-of-fit: how concordant is this distribution with the observed data? (3) Uncertainty: how concordant are other candidate distributions with the observed data? A simple unified approach for univariate data that addresses these traditionally distinct statistical notions, called "maximum fidelity", is presented. Maximum fidelity is a strict frequentist approach that is fundamentally based on model concordance with the observed data. The fidelity statistic is a general information measure based on the coordinate-independent cumulative distribution and on critical yet previously neglected symmetry considerations. An approximation for the null distribution of the fidelity allows its direct conversion to absolute model concordance (p value). Fidelity maximization allows identification of the most concordant model distribution, generating a method for parameter estimation, with neighboring, less concordant distributions providing the "uncertainty" in this estimate. Maximum fidelity provides an optimal approach for parameter estimation (superior to maximum likelihood) and a generally optimal approach for goodness-of-fit assessment of arbitrary models applied to univariate data. Extensions to binary data, binned data, multidimensional data, and classical parametric and nonparametric statistical tests are described. Maximum fidelity provides a philosophically consistent, robust, and seemingly optimal foundation for statistical inference. All findings are presented in an elementary way to be immediately accessible to all researchers utilizing statistical analysis. Comment: 66 pages, 32 figures, 7 tables, submitte
