On testing the equality of high dimensional mean vectors with unequal covariance matrices
In this article, we focus on the problem of testing the equality of several
high dimensional mean vectors with unequal covariance matrices. This is one of
the most important problems in multivariate statistical analysis and there have
been various tests proposed in the literature. Motivated by \citet{BaiS96E} and
\cite{ChenQ10T}, a test statistic is introduced and its asymptotic
distributions under the null hypothesis as well as the alternative hypothesis
are given. In addition, it is compared with a test statistic recently proposed
by \cite{SrivastavaK13Ta}. It is shown that our test statistic performs much
better, especially in the large-dimensional case.
On the sphericity test with large-dimensional observations
In this paper, we propose corrections to the likelihood ratio test and John's
test for sphericity in large dimensions. New formulas for the limiting
parameters in the CLT for linear spectral statistics of sample covariance
matrices with general fourth moments are first established. Using these
formulas, we derive the asymptotic distribution of the two proposed test
statistics under the null. These asymptotics are valid for general populations,
i.e., not necessarily Gaussian ones, provided the fourth moment is finite. Extensive
Monte-Carlo experiments are conducted to assess the quality of these tests with
a comparison to several existing methods from the literature. Moreover, we
obtain their asymptotic power functions under the alternative of a spiked
population model. Comment: 37 pages, 3 figures
Comparing large covariance matrices under weak conditions on the dependence structure and its application to gene clustering
Comparing large covariance matrices has important applications in modern
genomics, where scientists are often interested in understanding whether
relationships (e.g., dependencies or co-regulations) among a large number of
genes vary between different biological states. We propose a computationally
fast procedure for testing the equality of two large covariance matrices when
the dimensions of the covariance matrices are much larger than the sample
sizes. A distinguishing feature of the new procedure is that it imposes no
structural assumptions on the unknown covariance matrices. Hence the test is
robust with respect to various complex dependence structures that frequently
arise in genomics. We prove that the proposed procedure is asymptotically valid
under weak moment conditions. As an interesting application, we derive a new
gene clustering algorithm which shares the same nice property of avoiding
restrictive structural assumptions for high-dimensional genomics data. Using an
asthma gene expression dataset, we illustrate how the new test helps compare
the covariance matrices of the genes across different gene sets/pathways
between the disease group and the control group, and how the gene clustering
algorithm provides new insights on the way gene clustering patterns differ
between the two groups. The proposed methods have been implemented in the
R package HDtest, which is available on CRAN. Comment: The original title, dating back to May 2015, is "Bootstrap Tests on High
Dimensional Covariance Matrices with Applications to Understanding Gene
Clustering"
Tests for Multivariate Analysis of Variance in High Dimension Under Non-Normality
In this article, we consider the problem of testing the equality of mean vectors of dimension ρ of several groups with a common unknown non-singular covariance matrix Σ, based on N independent observation vectors, where N may be less than the dimension ρ. This problem, known in the literature as multivariate analysis of variance (MANOVA) in high dimension, has recently been considered by Srivastava and Fujikoshi [7], Srivastava [5] and Schott [3]. None of these tests is invariant under a change of the units of measurement. Along the lines of Srivastava and Du [8] and Srivastava [6], we propose a test that has this invariance property. The null and non-null distributions are derived under the assumption that (N, ρ) → ∞, where N may be less than ρ and the observation vectors follow a general non-normal model.
RAPTT: An Exact Two-Sample Test in High Dimensions Using Random Projections
In high dimensions, the classical Hotelling's $T^2$ test tends to have low
power or becomes undefined due to singularity of the sample covariance matrix.
In this paper, this problem is overcome by projecting the data matrix onto
lower dimensional subspaces through multiplication by random matrices. We
propose RAPTT (RAndom Projection T-Test), an exact test for equality of means
of two normal populations based on projected lower dimensional data. RAPTT does
not require any constraints on the dimension of the data or the sample size. A
simulation study indicates that in high dimensions the power of this test is
often greater than that of competing tests. The advantage of RAPTT is
illustrated on high-dimensional gene expression data involving the
discrimination of tumor and normal colon tissues.
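The projection idea can be sketched with the standard two-sample Hotelling statistic. The notation below ($n_1$, $n_2$, $p$, $k$, $R$, $S$) is assumed for illustration and is not taken from the paper; in particular, how RAPTT chooses and combines its random projections is not stated in the abstract, so this shows only a single projection under the assumption of two normal populations with a common positive-definite covariance matrix.

```latex
% Classical two-sample Hotelling statistic: samples of sizes n_1, n_2 in R^p,
% sample means \bar{x}_1, \bar{x}_2, pooled sample covariance S.
\[
  T^2 = \frac{n_1 n_2}{n_1 + n_2}\,
        (\bar{x}_1 - \bar{x}_2)^{\top} S^{-1} (\bar{x}_1 - \bar{x}_2),
  \qquad
  \frac{n_1 + n_2 - p - 1}{(n_1 + n_2 - 2)\,p}\, T^2 \sim F_{p,\; n_1 + n_2 - p - 1}
  \quad \text{under } H_0 .
\]
% S is singular whenever p > n_1 + n_2 - 2, so T^2 is undefined in that regime.
%
% Assumed sketch: R is a p x k matrix of full column rank, drawn independently
% of the data, with k <= n_1 + n_2 - 2. Replace each observation x by R^T x.
\[
  T_R^2 = \frac{n_1 n_2}{n_1 + n_2}\,
          (\bar{x}_1 - \bar{x}_2)^{\top} R\,(R^{\top} S R)^{-1} R^{\top}
          (\bar{x}_1 - \bar{x}_2),
  \qquad
  \frac{n_1 + n_2 - k - 1}{(n_1 + n_2 - 2)\,k}\, T_R^2 \sim F_{k,\; n_1 + n_2 - k - 1}
  \quad \text{under } H_0, \text{ conditionally on } R .
\]
```

Because the projected observations are again normal, with common covariance $R^{\top} \Sigma R$, the F distribution holds exactly for any fixed $k \le n_1 + n_2 - 2$, whatever the value of $p$.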