    Fast robust correlation for high-dimensional data

    The product moment covariance is a cornerstone of multivariate data analysis, from which one can derive correlations, principal components, Mahalanobis distances and many other results. Unfortunately, the product moment covariance and the corresponding Pearson correlation are very susceptible to outliers (anomalies) in the data. Several robust measures of covariance have been developed, but few are suitable for the ultrahigh dimensional data that are becoming more prevalent nowadays. For that one needs methods whose computation scales well with the dimension, that are guaranteed to yield a positive semidefinite covariance matrix, and that are sufficiently robust to outliers as well as sufficiently accurate in the statistical sense of low variability. We construct such methods using data transformations. The resulting approach is simple, fast and widely applicable. We study its robustness by deriving influence functions and breakdown values, and computing the mean squared error on contaminated data. Using these results we select a method that performs well overall. This also allows us to construct a faster version of the DetectDeviatingCells method (Rousseeuw and Van den Bossche, 2018) for detecting cellwise outliers, which can deal with much higher dimensions. The approach is illustrated on genomic data with 12,000 variables and color video data with 920,000 dimensions.

    Intrinsic data depth for Hermitian positive definite matrices

    Nondegenerate covariance, correlation and spectral density matrices are necessarily symmetric or Hermitian and positive definite. The main contribution of this paper is the development of statistical data depths for collections of Hermitian positive definite matrices by exploiting the geometric structure of the space as a Riemannian manifold. The depth functions allow one to naturally characterize most central or outlying matrices, but also provide a practical framework for inference in the context of samples of positive definite matrices. First, the desired properties of an intrinsic data depth function acting on the space of Hermitian positive definite matrices are presented. Second, we propose two computationally fast pointwise and integrated data depth functions that satisfy each of these requirements and investigate several robustness and efficiency aspects. As an application, we construct depth-based confidence regions for the intrinsic mean of a sample of positive definite matrices, which are applied to the exploratory analysis of a collection of covariance matrices associated with a multicenter research trial.

    Essentially All Gaussian Two-Party Quantum States are a priori Nonclassical but Classically Correlated

    Duan, Giedke, Cirac and Zoller (quant-ph/9908056) and, independently, Simon (quant-ph/9909044) have recently found necessary and sufficient conditions for the separability (classical correlation) of the Gaussian two-party (continuous variable) states. Duan et al. remark that their criterion is based on a "much stronger bound" on the total variance of a pair of Einstein-Podolsky-Rosen-type operators than is required simply by the uncertainty relation. Here, we seek to formalize and test this particular assertion in both classical and quantum-theoretic frameworks. We first attach to these states the classical a priori probability (Jeffreys' prior), proportional to the volume element of the Fisher information metric on the Riemannian manifold of Gaussian (quadrivariate normal) probability distributions. Then, numerical evidence indicates that more than ninety-nine percent of the Gaussian two-party states do, in fact, meet the more stringent criterion for separability. We collaterally note that the prior probability assigned to the classical states, that is, those having positive Glauber-Sudarshan P-representations, is less than one-thousandth of one percent. We then seek to attach as a measure to the Gaussian two-party states the volume element of the associated (quantum-theoretic) Bures (minimal monotone) metric. Our several extensive analyses then persistently yield probabilities of separability and classicality that are, to very high orders of accuracy, unity and zero, respectively, so the two apparently quite distinct (classical and quantum-theoretic) forms of analysis are rather remarkably consistent in their findings.
    Comment: Seven pages, one table. Expanded introduction, additional references include