12 research outputs found

    Divisible Statistics and Their Partial Sum Processes: Asymptotic Properties and Applications

    Divisible statistics have been widely used in many areas of statistical analysis. For example, Pearson's Chi-square statistic and the log-likelihood ratio statistic are frequently used in goodness of fit (GOF) and categorical analysis; the maximum likelihood (ML) estimators of Shannon's and Simpson's diversity indices are often used as measures of diversity; and the spectral statistic plays a key role in the theory of a large number of rare events. In the classical multinomial model, where the number of disjoint events N and their probabilities are all fixed, limit distributions of many divisible statistics have gradually been established. However, most of these results rest on the asymptotic equivalence of the statistics to Pearson's Chi-square statistic and on the known limit distribution of the latter. In fact, deeper analysis shows that the key point is not the asymptotic behavior of the Chi-square statistic, but that of the normalized frequencies. Based on the asymptotic normality of the normalized frequencies in the classical model, a unified approach to the limit theorems of more general divisible statistics can be established, of which the case of the Chi-square statistic is simply a natural corollary. In many applications, however, the classical multinomial model is not appropriate, and an extension to new models becomes necessary. These new models, called "non-classical" multinomial models, cover the case where N increases and the probabilities {Pni} change as the sample size n increases. As we will see, in these non-classical models both the asymptotic normality of the normalized frequencies and the asymptotic equivalence of many divisible statistics to the Chi-square statistic are lost, and the limit theorems established in the classical model are no longer valid.
The extension to non-classical models not only met the demands of many real-world applications, but also opened a new research area in statistical analysis, which has not been thoroughly investigated so far. Although some results on the limit distributions of divisible statistics in non-classical models have been obtained, e.g., Holst (1972); Morris (1975); Ivchenko and Levin (1976); Ivchenko and Medvedev (1979), they are far from complete. Another, more advanced approach, introduced by Khmaladze (1984), uses modern martingale theory to establish functional limit theorems for the partial sum processes of divisible statistics, though it has not yet attracted much attention from applied statisticians. In the main part of this thesis, we show that this martingale approach can be extended to more general situations where both Gaussian and Poissonian frequencies exist, and further discuss the properties and applications of the limiting processes, especially in constructing distribution-free statistics. The last part of the thesis concerns the statistical analysis of a large number of rare events (LNRE), an important class of non-classical multinomial models that arises in numerous applications. In LNRE models, most of the frequencies are very small and it is not immediately clear how consistent and reliable inference can be achieved. Based on the definitions and key concepts first introduced by Khmaladze (1988), we discuss a particular model in the context of the diversity of questionnaires. The advanced statistical techniques used in establishing the limit theorems, such as large deviations, contiguity and Edgeworth expansion, underpin the potential of LNRE theory to become a fruitful research area in the future.
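As a rough illustration of what "divisible" means here, the sketch below computes two of the statistics named in the abstract as sums of a per-cell function of the observed and expected frequencies. The function names and the example counts are illustrative assumptions, not taken from the thesis.

```python
import math

def divisible_statistic(counts, probs, g):
    """Sum g(observed, expected) over all cells of a multinomial sample."""
    n = sum(counts)
    return sum(g(c, n * p) for c, p in zip(counts, probs))

def pearson_term(c, e):
    # Per-cell contribution to Pearson's Chi-square statistic.
    return (c - e) ** 2 / e

def llr_term(c, e):
    # Per-cell contribution to the log-likelihood ratio statistic,
    # using the convention 0 * log(0) = 0.
    return 2.0 * c * math.log(c / e) if c > 0 else 0.0

# Hypothetical data: 100 observations over 4 equiprobable cells.
counts = [18, 22, 31, 29]
probs = [0.25, 0.25, 0.25, 0.25]

chi2 = divisible_statistic(counts, probs, pearson_term)
g2 = divisible_statistic(counts, probs, llr_term)
print(chi2, g2)
```

Both statistics share the same outer sum and differ only in the per-cell function, which is the structure the unified limit theorems exploit.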

    Comments on the infinitely divisibility of the Conway--Maxwell--Poisson distribution

    In an elegant recent paper \cite{geng2022conway}, Geng and Xia settled the question of the infinite divisibility of the Conway--Maxwell--Poisson distribution, relying in large part on several results from complex analysis. In this note we show how these complex analytic methods can be circumvented, thereby giving a completely elementary proof of their result.

    Large Number of Rare Events: Diversity Analysis in Multiple Choice Questionnaires and Related Topics

    The statistical analysis of a large number of rare events (LNRE), which can also be called the statistical theory of diversity, is a subject of acute interest both in statistical theory and in numerous applications. A careful eye will quickly see the presence of a large number of very rare objects almost everywhere: large numbers of rare species in ecosystems, large numbers of rare opinions in any opinion pool, large numbers of small admixtures in any solution and large numbers of rare words in any text are only a few examples. In studying such objects, the interest for mathematical statisticians lies in the fact that most of the frequencies are small and, therefore, difficult to deal with. It is not immediately clear how one can derive consistent and reliable inference from a large number of such frequencies. In this thesis we study the diversity of questionnaires with multiple answers. It has been demonstrated that this is a particular model of LNRE theory. In our analysis, the theories of large deviations, contiguity and Edgeworth expansion were employed, and limit theorems have been established.

    Generalized Error Exponents For Small Sample Universal Hypothesis Testing

    The small sample universal hypothesis testing problem is investigated in this paper, in which the number of samples $n$ is smaller than the number of possible outcomes $m$. The goal of this work is to find an appropriate criterion to analyze statistical tests in this setting. A suitable model for analysis is the high-dimensional model in which both $n$ and $m$ increase to infinity, and $n=o(m)$. A new performance criterion based on large deviations analysis is proposed and it generalizes the classical error exponent applicable for large sample problems (in which $m=O(n)$). This generalized error exponent criterion provides insights that are not available from asymptotic consistency or central limit theorem analysis. The following results are established for the uniform null distribution: (i) The best achievable probability of error $P_e$ decays as $P_e=\exp\{-(n^2/m) J (1+o(1))\}$ for some $J>0$. (ii) A class of tests based on separable statistics, including the coincidence-based test, attains the optimal generalized error exponents. (iii) Pearson's chi-square test has a zero generalized error exponent and thus its probability of error is asymptotically larger than the optimal test. Comment: 43 pages, 4 figures
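To make the notion of a separable statistic concrete, here is a minimal sketch for the sparse regime where $n$ is smaller than $m$. It assumes the coincidence-type statistic is the number of symbols observed exactly once, which may differ from the paper's exact definition; the helper names and the sample data are hypothetical.

```python
from collections import Counter

def separable_statistic(samples, m, f):
    """Sum f(count_j) over all m symbols, including those never observed."""
    counts = Counter(samples)
    seen = sum(f(c) for c in counts.values())
    unseen = (m - len(counts)) * f(0)
    return seen + unseen

def singleton(c):
    # Assumed coincidence-type term: 1 if the symbol appears exactly once.
    return 1 if c == 1 else 0

def pearson_uniform_term(n, m):
    # Per-symbol Pearson term against the uniform null with expectation n / m.
    e = n / m
    return lambda c: (c - e) ** 2 / e

# Hypothetical sparse sample: n = 7 draws from an alphabet of m = 20 symbols.
samples = [3, 7, 7, 1, 9, 3, 4]
m = 20
n = len(samples)

k1 = separable_statistic(samples, m, singleton)
chi2 = separable_statistic(samples, m, pearson_uniform_term(n, m))
print(k1, chi2)
```

Pearson's statistic fits the same per-symbol summation form, so the contrast drawn in the abstract concerns the choice of per-symbol function rather than the form of the statistic.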

    Asymptotic minimaxity of chi-square tests

    We show that the sequence of chi-square tests is asymptotically minimax if the number of cells increases with increasing sample size. The proof uses a theorem on the asymptotic normality of chi-square test statistics, obtained under new compact assumptions.

    Martingale limit theorems of divisible statistics in a multinomial scheme with mixed frequencies

    The martingale approach to limit theorems of divisible statistics in non-classical multinomial schemes, established by Khmaladze in 1983, has shown great power for those models with all asymptotically Poissonian frequencies. We extended this approach to more general situations, which include both asymptotically Gaussian and Poissonian frequencies, and established functional limit theorems. Keywords: functional limit theorems; divisible statistics; multinomial scheme; mixed frequencies