
    On testing the equality of high dimensional mean vectors with unequal covariance matrices

    In this article, we focus on the problem of testing the equality of several high dimensional mean vectors with unequal covariance matrices. This is one of the most important problems in multivariate statistical analysis, and various tests have been proposed in the literature. Motivated by Bai and Saranadasa (1996) and Chen and Qin (2010), a test statistic is introduced and its asymptotic distributions under the null hypothesis as well as the alternative hypothesis are given. In addition, it is compared with a test statistic recently proposed by Srivastava and Kubokawa (2013). It is shown that our test statistic performs much better, especially in the large dimensional case.
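
The abstract does not state the proposed statistic, so the following is only a minimal two-sample sketch of the quantity that underlies tests in the Bai and Saranadasa / Chen and Qin tradition: the squared distance between sample means, corrected by tr(S1)/n1 + tr(S2)/n2 so that it is unbiased for the squared distance between the population means even when the covariance matrices differ. The function names are illustrative assumptions, and the normalization needed to obtain an actual test (and the extension to several groups) is not shown.

```python
def mean_vector(rows):
    """Coordinate-wise mean of an n x p data set given as a list of rows."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def trace_of_sample_cov(rows):
    """Trace of the unbiased sample covariance: the sum of per-coordinate variances."""
    n = len(rows)
    mean = mean_vector(rows)
    return sum(
        sum((x - m) ** 2 for x in col) / (n - 1)
        for col, m in zip(zip(*rows), mean)
    )


def squared_mean_distance_estimate(sample1, sample2):
    """Unbiased estimate of ||mu1 - mu2||^2 that allows unequal covariances.

    E||xbar1 - xbar2||^2 = ||mu1 - mu2||^2 + tr(Sigma1)/n1 + tr(Sigma2)/n2,
    so the two trace terms are subtracted. No p x p matrix is ever inverted,
    which is why this quantity stays usable when p exceeds the sample sizes.
    """
    m1, m2 = mean_vector(sample1), mean_vector(sample2)
    dist2 = sum((a - b) ** 2 for a, b in zip(m1, m2))
    correction = (
        trace_of_sample_cov(sample1) / len(sample1)
        + trace_of_sample_cov(sample2) / len(sample2)
    )
    return dist2 - correction
```

Under the null hypothesis of equal means this estimate has expectation zero; a test would additionally divide it by an estimate of its standard deviation, which is where the individual proposals differ.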

    On the sphericity test with large-dimensional observations

    In this paper, we propose corrections to the likelihood ratio test and John's test for sphericity in large dimensions. New formulas for the limiting parameters in the CLT for linear spectral statistics of sample covariance matrices with general fourth moments are first established. Using these formulas, we derive the asymptotic distribution of the two proposed test statistics under the null. These asymptotics are valid for a general population, i.e. not necessarily Gaussian, provided the fourth moment is finite. Extensive Monte Carlo experiments are conducted to assess the quality of these tests, with a comparison to several existing methods from the literature. Moreover, we also obtain their asymptotic power functions under a spiked population model as a specific alternative. Comment: 37 pages, 3 figures
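
For orientation, the classical statistic behind John's test can be written as U = (1/p) tr[(S / (tr(S)/p) - I)^2], which simplifies to p tr(S^2) / (tr S)^2 - 1. The sketch below computes only this uncorrected U; the large-dimensional corrections and limiting distributions derived in the paper are not reproduced, and the function names are illustrative assumptions.

```python
def sample_covariance(rows):
    """Unbiased p x p sample covariance of an n x p data set (list of rows)."""
    n = len(rows)
    p = len(rows[0])
    mean = [sum(col) / n for col in zip(*rows)]
    centered = [[x - m for x, m in zip(row, mean)] for row in rows]
    return [
        [sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(p)]
        for i in range(p)
    ]


def john_u(rows):
    """John's sphericity statistic U = p * tr(S^2) / (tr S)^2 - 1.

    U is zero exactly when S is proportional to the identity and grows as the
    eigenvalues of S spread out. It is invariant to rescaling the data.
    """
    s = sample_covariance(rows)
    p = len(s)
    trace = sum(s[i][i] for i in range(p))
    # For a symmetric matrix, tr(S^2) equals the sum of squared entries.
    trace_sq = sum(v * v for row in s for v in row)
    return p * trace_sq / trace ** 2 - 1.0
```

Because U depends on S only through tr(S) and tr(S^2), it remains computable when p exceeds n, which is one reason it is a natural candidate for a large-dimensional correction.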

    Comparing large covariance matrices under weak conditions on the dependence structure and its application to gene clustering

    Comparing large covariance matrices has important applications in modern genomics, where scientists are often interested in understanding whether relationships (e.g., dependencies or co-regulations) among a large number of genes vary between different biological states. We propose a computationally fast procedure for testing the equality of two large covariance matrices when the dimensions of the covariance matrices are much larger than the sample sizes. A distinguishing feature of the new procedure is that it imposes no structural assumptions on the unknown covariance matrices. Hence the test is robust with respect to various complex dependence structures that frequently arise in genomics. We prove that the proposed procedure is asymptotically valid under weak moment conditions. As an interesting application, we derive a new gene clustering algorithm which shares the same nice property of avoiding restrictive structural assumptions for high-dimensional genomics data. Using an asthma gene expression dataset, we illustrate how the new test helps compare the covariance matrices of the genes across different gene sets/pathways between the disease group and the control group, and how the gene clustering algorithm provides new insights on the way gene clustering patterns differ between the two groups. The proposed methods have been implemented in the R package HDtest, which is available on CRAN. Comment: The original title, dating back to May 2015, is "Bootstrap Tests on High Dimensional Covariance Matrices with Applications to Understanding Gene Clustering"

    "Tests for Multivariate Analysis of Variance in High Dimension Under Non-Normality"

    In this article, we consider the problem of testing the equality of mean vectors of dimension ρ of several groups with a common unknown non-singular covariance matrix Σ, based on N independent observation vectors, where N may be less than the dimension ρ. This problem, known in the literature as multivariate analysis of variance (MANOVA) in high dimension, has recently been considered by Srivastava and Fujikoshi [7], Srivastava [5] and Schott [3]. None of these tests is invariant under a change of the units of measurement. Along the lines of Srivastava and Du [8] and Srivastava [6], we propose a test that has this invariance property. The null and non-null distributions are derived under the assumption that (N, ρ) → ∞, where N may be less than ρ and the observation vectors follow a general non-normal model.

    RAPTT: An Exact Two-Sample Test in High Dimensions Using Random Projections

    In high dimensions, the classical Hotelling's $T^2$ test tends to have low power or becomes undefined due to singularity of the sample covariance matrix. In this paper, this problem is overcome by projecting the data matrix onto lower dimensional subspaces through multiplication by random matrices. We propose RAPTT (RAndom Projection T-Test), an exact test for equality of means of two normal populations based on projected lower dimensional data. RAPTT does not require any constraints on the dimension of the data or the sample size. A simulation study indicates that in high dimensions the power of this test is often greater than that of competing tests. The advantage of RAPTT is illustrated on high-dimensional gene expression data involving the discrimination of tumor and normal colon tissues.
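
The projection idea can be sketched as follows: multiply both samples by one random p x k Gaussian matrix with k small enough that the pooled covariance of the projected data is invertible, then apply the classical two-sample Hotelling statistic in k dimensions. This is a single-projection illustration, not RAPTT itself: how the paper chooses, combines or calibrates projections is not stated in the abstract and is not reproduced here. All function names are assumptions, and the F reference distribution relies on the classical assumptions of normality and a common covariance matrix.

```python
import random


def project(rows, r):
    """Multiply an n x p data matrix by a p x k projection matrix r."""
    k = len(r[0])
    return [[sum(x * r_row[j] for x, r_row in zip(row, r)) for j in range(k)]
            for row in rows]


def mean_and_scatter(rows):
    """Mean vector and centered sum-of-squares matrix of a data set."""
    n, k = len(rows), len(rows[0])
    mean = [sum(col) / n for col in zip(*rows)]
    c = [[x - m for x, m in zip(row, mean)] for row in rows]
    scatter = [[sum(v[i] * v[j] for v in c) for j in range(k)] for i in range(k)]
    return mean, scatter


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [bi] for row, bi in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda i: abs(m[i][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for i in range(col + 1, n):
            f = m[i][col] / m[col][col]
            for j in range(col, n + 1):
                m[i][j] -= f * m[col][j]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def projected_hotelling_f(sample1, sample2, k, rng):
    """F-scaled Hotelling T^2 after one random projection to k dimensions.

    Requires k <= n1 + n2 - 2 so the projected pooled covariance is invertible.
    Returns the statistic and its (k, n1 + n2 - k - 1) degrees of freedom.
    """
    n1, n2, p = len(sample1), len(sample2), len(sample1[0])
    # The projection matrix is drawn independently of the data.
    r = [[rng.gauss(0.0, 1.0) for _ in range(k)] for _ in range(p)]
    m1, s1 = mean_and_scatter(project(sample1, r))
    m2, s2 = mean_and_scatter(project(sample2, r))
    dof = n1 + n2 - 2
    pooled = [[(s1[i][j] + s2[i][j]) / dof for j in range(k)] for i in range(k)]
    d = [a - b for a, b in zip(m1, m2)]
    t2 = (n1 * n2 / (n1 + n2)) * sum(di * wi for di, wi in zip(d, solve(pooled, d)))
    df2 = n1 + n2 - k - 1
    return t2 * df2 / (dof * k), (k, df2)
```

A call such as `projected_hotelling_f(x, y, 3, random.Random(0))` returns the statistic together with its degrees of freedom; a p-value would come from the upper tail of the corresponding F distribution, which is left out to keep the sketch within the standard library.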