12 research outputs found
Divisible Statistics and Their Partial Sum Processes: Asymptotic Properties and Applications
Divisible statistics have been widely used in many areas of statistical analysis. For example, Pearson's Chi-square statistic and the log-likelihood ratio statistic are frequently used in goodness-of-fit (GOF) and categorical analysis; the maximum likelihood (ML) estimators of Shannon's and Simpson's diversity indices are often used as measures of diversity; and the spectral statistic plays a key role in the theory of a large number of rare events. In the classical multinomial model, where the number of disjoint events N and their probabilities are all fixed, limit distributions of many divisible statistics have gradually been established. However, most of these results rest on the asymptotic equivalence of the statistics to Pearson's Chi-square statistic and on the known limit distribution of the latter. A deeper analysis shows that the key point is not the asymptotic behavior of the Chi-square statistic but that of the normalized frequencies. Based on the asymptotic normality of the normalized frequencies in the classical model, a unified approach to the limit theorems of more general divisible statistics can be established, of which the case of the Chi-square statistic is simply a natural corollary.
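As a rough illustration of the statistics named above, the following sketch treats a divisible statistic as a sum of a function g applied to each cell frequency separately. The function names, the signature g(nu, n, p), and the example counts are assumptions made for illustration; they are not taken from the thesis.

```python
import math

def divisible_statistic(counts, probs, g):
    """Sum g(nu_i, n, p_i) over all cells i (illustrative form only)."""
    n = sum(counts)
    return sum(g(nu, n, p) for nu, p in zip(counts, probs))

def pearson_term(nu, n, p):
    # One cell's contribution to Pearson's Chi-square statistic.
    return (nu - n * p) ** 2 / (n * p)

def llr_term(nu, n, p):
    # One cell's contribution to the log-likelihood ratio statistic,
    # with the convention 0 * log(0) = 0.
    return 2.0 * nu * math.log(nu / (n * p)) if nu > 0 else 0.0

def shannon_term(nu, n, p):
    # Plug-in (ML) estimate of Shannon's index; p is unused.
    return -(nu / n) * math.log(nu / n) if nu > 0 else 0.0

def simpson_term(nu, n, p):
    # Plug-in (ML) estimate of Simpson's index; p is unused.
    return (nu / n) ** 2

def spectral_term(k):
    # Counts the cells whose frequency equals k exactly.
    return lambda nu, n, p: 1 if nu == k else 0

# Hypothetical data: 100 observations over 4 equiprobable cells.
counts = [18, 22, 25, 35]
probs = [0.25, 0.25, 0.25, 0.25]

chi2 = divisible_statistic(counts, probs, pearson_term)
llr = divisible_statistic(counts, probs, llr_term)
shannon = divisible_statistic(counts, probs, shannon_term)
simpson = divisible_statistic(counts, probs, simpson_term)
cells_with_25 = divisible_statistic(counts, probs, spectral_term(25))
```

All five statistics share the same outer sum and differ only in the per-cell function, which is the structure that a unified limit theory can exploit.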
In many applications, however, the classical multinomial model is not appropriate, and an extension to new models becomes necessary. These new models, called "non-classical" multinomial models, cover the case in which N increases and the probabilities {Pni} change as the sample size n increases. As we will see, in these non-classical models both the asymptotic normality of the normalized frequencies and the asymptotic equivalence of many divisible statistics to the Chi-square statistic are lost, and the limit theorems established in the classical model are no longer valid.
The extension to non-classical models not only meets the demands of many real-world applications but also opens a new research area in statistical analysis, one that has not been thoroughly investigated so far. Although some results on the limit distributions of divisible statistics in non-classical models have been obtained, e.g., Holst (1972); Morris (1975); Ivchenko and Levin (1976); Ivchenko and Medvedev (1979), they are far from complete. Another, more advanced approach, introduced by Khmaladze (1984), has not yet attracted much attention from applied statisticians; it uses modern martingale theory to establish functional limit theorems for the partial sum processes of divisible statistics. In the main part of this thesis, we show that this martingale approach can be extended to more general situations where both Gaussian and Poissonian frequencies exist, and we further discuss the properties and applications of the limiting processes, especially in constructing distribution-free statistics.
The last part of the thesis concerns the statistical analysis of a large number of rare events (LNRE), an important class of non-classical multinomial models that arises in numerous applications. In LNRE models, most of the frequencies are very small, and it is not immediately clear how consistent and reliable inference can be achieved. Based on the definitions and key concepts first introduced by Khmaladze (1988), we discuss a particular model in the context of the diversity of questionnaires. The advanced statistical techniques used in establishing the limit theorems, such as large deviations, contiguity and Edgeworth expansion, underpin the potential of LNRE theory to become a fruitful research area in the future.
Comments on the infinitely divisibility of the Conway--Maxwell--Poisson distribution
In an elegant recent paper \cite{geng2022conway}, Geng and Xia settled the
question of the infinite divisibility of the Conway--Maxwell--Poisson
distribution, using in large part several results from complex analysis. In
this note we show how these complex analytic methods can be circumvented,
thereby giving a completely elementary proof of their result.
Large Number of Rare Events: Diversity Analysis in Multiple Choice Questionnaires and Related Topics
The statistical analysis of a large number of rare events (LNRE), which can also be called the statistical theory of diversity, is a subject of acute interest both in statistical theory and in numerous applications. A careful eye will quickly see the presence of a large number of very rare objects almost everywhere: large numbers of rare species in ecosystems, large numbers of rare opinions in any opinion pool, large numbers of small admixtures in any solution and large numbers of rare words in any text are only a few examples. In studying such objects, the interest for mathematical statisticians lies in the fact that most of the frequencies are small and, therefore, difficult to deal with. It is not immediately clear how one can derive consistent and reliable inference from a large number of such frequencies. In this thesis we study the diversity of questionnaires with multiple answers. It has been demonstrated that this is a particular model of LNRE theory. In our analysis, the theories of large deviations, contiguity and Edgeworth expansion are employed, and limit theorems are established.
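To make the "rare words in any text" example concrete, here is a minimal sketch that tabulates how many distinct items occur exactly k times in a sample. The function name and the toy sentence are illustrative assumptions and are not drawn from the thesis.

```python
from collections import Counter

def frequency_spectrum(observations):
    """Map k -> number of distinct items observed exactly k times."""
    item_counts = Counter(observations)   # item -> frequency
    return Counter(item_counts.values())  # frequency -> number of items

# Toy sample: most distinct words occur only once or twice.
words = "the cat sat on the mat and the dog sat on the log".split()
spectrum = frequency_spectrum(words)

distinct_items = sum(spectrum.values())
sample_size = sum(k * m for k, m in spectrum.items())
singletons = spectrum[1]
```

Even in this tiny sample, five of the eight distinct words appear exactly once, which mirrors the situation described above where most frequencies are small.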
Generalized Error Exponents For Small Sample Universal Hypothesis Testing
The small sample universal hypothesis testing problem is investigated in this
paper, in which the number of samples $n$ is smaller than the number of
possible outcomes $m$. The goal of this work is to find an appropriate
criterion to analyze statistical tests in this setting. A suitable model for
analysis is the high-dimensional model in which both $n$ and $m$ increase to
infinity, and $n=o(m)$. A new performance criterion based on large deviations
analysis is proposed and it generalizes the classical error exponent applicable
for large sample problems (in which $m=O(n)$). This generalized error exponent
criterion provides insights that are not available from asymptotic consistency
or central limit theorem analysis. The following results are established for
the uniform null distribution:
(i) The best achievable probability of error $P_e$ decays as
$P_e=\exp\{-(n^2/m) J (1+o(1))\}$ for some $J>0$.
(ii) A class of tests based on separable statistics, including the
coincidence-based test, attains the optimal generalized error exponents.
(iii) Pearson's chi-square test has a zero generalized error exponent and
thus its probability of error is asymptotically larger than that of the optimal test.
Comment: 43 pages, 4 figures
Asymptotic minimaxity of chi-square tests
We show that the sequence of chi-square tests is asymptotically minimax if the number of cells increases with increasing sample size. The proof uses a theorem on the asymptotic normality of chi-square test statistics, obtained under new compact assumptions.
Martingale limit theorems of divisible statistics in a multinomial scheme with mixed frequencies
The martingale approach to limit theorems of divisible statistics in non-classical multinomial schemes, established by Khmaladze in 1983, has shown great power for those models with all asymptotically Poissonian frequencies. We extended this approach to more general situations, which include both asymptotically Gaussian and Poissonian frequencies, and established functional limit theorems.
Keywords: functional limit theorems; divisible statistics; multinomial scheme; mixed frequencies