Shifting Goals And Mounting Challenges For Statistical Methodology
Modern interdisciplinary research in statistical science encompasses a wide field: agriculture, biology, biomedical sciences along with bioinformatics, clinical sciences, education, environmental and public health disciplines, genomic science, industry, molecular genetics, socio-behavior, socio-economics, toxicology, and a variety of other disciplines. Statistical science has historically had mathematical perspectives dominating theoretical and methodological developments. Yet, the advent of modern information technology has opened the doors for highly computation-intensive statistical tools (i.e., software), wherein mathematical aspects are often de-emphasized. Knowledge discovery and data mining (KDDM) is now becoming a dominating force, with bioinformatics as a notable example. In view of this apparent discordance between mathematical (frequentist as well as Bayesian) and computational approaches to statistical resolutions, and a genuine need to formulate training as well as research curricula to meet growing demands, a critical appraisal of statistical innovations is made with due respect to their mathematical heritage, as well as scope of application. Some of the challenging statistical tasks are illustrated.
Kendall's tau in high-dimensional genomic parsimony
High-dimensional data models, often with low sample size, abound in many
interdisciplinary studies, genomics and large biological systems being most
noteworthy. The conventional assumption of multinormality or linearity of
regression may not be plausible for such models which are likely to be
statistically complex due to a large number of parameters as well as various
underlying restraints. As such, parametric approaches may not be very
effective. Anything beyond parametrics, despite its increased scope and
robustness, may generally be hampered by the low sample size and
hence be unable to give reasonable margins of error. Kendall's tau statistic is
exploited in this context with emphasis on dimensional rather than sample size
asymptotics. The Chen--Stein theorem has been thoroughly appraised in this
study. Applications of these findings in some microarray data models are
illustrated.
Comment: Published at http://dx.doi.org/10.1214/074921708000000183 in the IMS
Collections (http://www.imstat.org/publications/imscollections.htm) by the
Institute of Mathematical Statistics (http://www.imstat.org)
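To make the setting above concrete, here is a minimal sketch of computing Kendall's tau gene by gene when there are many genes and only a handful of samples. It is not the procedure of the paper: the function names, the toy expression matrix, and the use of tau-a (no tie correction) are all assumptions made for illustration.

```python
from itertools import combinations


def kendall_tau(x, y):
    """Kendall's tau-a for two equal-length sequences (no tie correction)."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    score = 0  # concordant pairs minus discordant pairs
    for i, j in combinations(range(n), 2):
        product = (x[i] - x[j]) * (y[i] - y[j])
        if product > 0:
            score += 1
        elif product < 0:
            score -= 1
    return score / (n * (n - 1) // 2)


def genewise_tau(expression, covariate):
    """Tau between each gene (one row per gene) and a sample-level covariate."""
    return [kendall_tau(row, covariate) for row in expression]


# Hypothetical toy data: 3 genes (rows) measured on 4 samples (columns).
expression = [
    [0.1, 0.4, 0.5, 0.9],
    [2.0, 1.5, 1.1, 0.3],
    [0.2, 0.8, 0.6, 1.0],
]
covariate = [1, 2, 3, 4]
taus = genewise_tau(expression, covariate)
```

With real microarray data the number of rows would run into the thousands while the number of columns stays small, which is the regime where asymptotics in the dimension rather than the sample size become relevant.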
An asymptotically normal test for the selective neutrality hypothesis
An important parameter in the study of population evolution is
$\theta=4N\nu$, where $N$ is the effective population size and $\nu$ is the
rate of mutation per locus per generation. Therefore, $\theta$ represents the
mean number of mutations per site per generation. There are many estimators of
$\theta$, one of them being the mean number of pairwise nucleotide differences,
which we call $T_2$. Other estimators are $T_1$, based on
the number of segregating sites and $T_3$, based on the number of
singletons. The concept of selective neutrality can be interpreted as a
differentiated nucleotide distribution for mutant sites when compared to the
overall nucleotide distribution. Tajima (1989) has proposed the so-called
Tajima's test of selective neutrality based on $T_2-T_1$.
Its complex empirical behavior (Kiihl, 2005) motivates us to propose a test
statistic solely based on $T_2$. We are thus able to prove asymptotic
normality under different assumptions on the number of sequences and number of
sites via $U$-statistics theory.
Comment: Published at http://dx.doi.org/10.1214/193940307000000293 in the IMS
Collections (http://www.imstat.org/publications/imscollections.htm) by the
Institute of Mathematical Statistics (http://www.imstat.org)
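The estimators named in this abstract can be illustrated on a toy alignment. The sketch below is an assumption-laden illustration, not the paper's test: it computes the mean number of pairwise nucleotide differences, the number of segregating sites, the standard Watterson-type rescaling of that count by the harmonic sum, and their difference (the numerator of Tajima's statistic, without its variance normalization). All function names and the example sequences are hypothetical.

```python
from itertools import combinations


def _check(seqs):
    """Require at least two aligned sequences of equal length."""
    if len(seqs) < 2 or len({len(s) for s in seqs}) != 1:
        raise ValueError("need at least two sequences of equal length")


def mean_pairwise_differences(seqs):
    """Average Hamming distance over all unordered pairs of sequences."""
    _check(seqs)
    pairs = list(combinations(seqs, 2))
    total = sum(sum(a != b for a, b in zip(s, t)) for s, t in pairs)
    return total / len(pairs)


def segregating_sites(seqs):
    """Number of alignment columns showing more than one nucleotide."""
    _check(seqs)
    return sum(len(set(column)) > 1 for column in zip(*seqs))


def watterson_estimate(seqs):
    """Segregating-site count divided by a_n = 1 + 1/2 + ... + 1/(n-1)."""
    a_n = sum(1.0 / i for i in range(1, len(seqs)))
    return segregating_sites(seqs) / a_n


def tajima_numerator(seqs):
    """Pairwise-difference estimate minus segregating-site estimate."""
    return mean_pairwise_differences(seqs) - watterson_estimate(seqs)


# Hypothetical alignment of three sequences over four sites.
alignment = ["AAAA", "AAAT", "AATT"]
pi_hat = mean_pairwise_differences(alignment)
s_count = segregating_sites(alignment)
diff = tajima_numerator(alignment)
```

The mean pairwise difference is an average of a symmetric kernel over all pairs of sequences, which is why U-statistics theory is a natural tool for its asymptotics.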
Two-Stage Likelihood Ratio and Union–Intersection Tests for One-Sided Alternatives Multivariate Mean with Nuisance Dispersion Matrix
For a multinormal distribution with an unknown dispersion matrix, union-intersection (UI) tests for the mean against one-sided alternatives are considered. The
null distribution of the UI test statistic is derived and its power monotonicity
properties are studied. A Stein-type two-stage procedure is proposed to eliminate
some of the inherent drawbacks of such tests. Some comparisons are also made
with some recently proposed alternative conditional likelihood ratio tests.
Second-Order Pitman Admissibility and Pitman Closeness: The Multiparameter Case and Stein-Rule Estimators
In a multiparameter estimation problem, second-order Pitman admissibility and Pitman closeness properties are studied for first-order efficient estimators. Bearing in mind the dominant role of Stein-rule estimators in multiparameter estimation theory, such second-order properties are also studied for shrinkage maximum likelihood estimators.
A New Smooth Density Estimator for Non-Negative Random Variables
Commonly used kernel density estimators may not provide admissible values of the density or its functionals at the boundaries for densities with restricted support. For smoothing the empirical distribution, a generalization of Hille's lemma, considered here, alleviates some of the problems of the kernel density estimator near the boundaries. For nonnegative random variables, which crop up in reliability and survival analysis, the proposed procedure is thoroughly explored; its consistency and asymptotic distributional results are established under appropriate regularity assumptions. Methods of obtaining smoothing parameters through cross-validation are given, and graphical illustrations of the estimator for continuous (at zero) as well as discontinuous densities are provided.
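As a rough illustration of smoothing an empirical distribution on the nonnegative half-line, the sketch below uses the classical Poisson-weight form associated with Hille's lemma, not the generalization studied in the paper. It assumes the smoothed distribution is the Poisson-weighted sum of the empirical CDF $F_n$ evaluated at $k/\lambda$; differentiating that sum gives the density $\lambda \sum_k p_k(\lambda x)\,[F_n((k+1)/\lambda) - F_n(k/\lambda)]$ with Poisson probabilities $p_k$. The function names, the sample, and the fixed value of the smoothing parameter `lam` are all hypothetical; in practice `lam` would be chosen by something like cross-validation.

```python
import math
from bisect import bisect_right


def poisson_pmf(k, mu):
    """Poisson probability of k with mean mu, computed in log space."""
    if mu == 0.0:
        return 1.0 if k == 0 else 0.0
    return math.exp(k * math.log(mu) - mu - math.lgamma(k + 1))


def poisson_smoothed_density(sample, x, lam):
    """Density estimate at x >= 0 from Poisson-weighted ECDF increments."""
    if x < 0 or lam <= 0 or not sample:
        raise ValueError("need x >= 0, lam > 0 and a non-empty sample")
    data = sorted(sample)
    n = len(data)

    def ecdf(t):
        # Fraction of observations less than or equal to t.
        return bisect_right(data, t) / n

    # ECDF increments vanish once k / lam exceeds the largest observation.
    k_max = int(math.floor(lam * data[-1]))
    total = 0.0
    for k in range(k_max + 1):
        increment = ecdf((k + 1) / lam) - ecdf(k / lam)
        total += poisson_pmf(k, lam * x) * increment
    return lam * total


# Hypothetical sample of positive lifetimes and smoothing parameter.
lifetimes = [0.5, 1.0, 1.5, 2.0]
lam = 4.0
grid = [i * 0.01 for i in range(2001)]  # grid on [0, 20]
values = [poisson_smoothed_density(lifetimes, x, lam) for x in grid]
# Trapezoidal approximation of the integral of the estimate over the grid.
area = sum((values[i] + values[i + 1]) * 0.005 for i in range(len(grid) - 1))
```

Each term of the sum is a gamma density in x, so the estimate is a mixture of gamma densities supported on the nonnegative half-line and places no mass below zero, in contrast to a symmetric kernel centred near the boundary.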