Fast robust correlation for high-dimensional data
The product moment covariance is a cornerstone of multivariate data analysis,
from which one can derive correlations, principal components, Mahalanobis
distances and many other results. Unfortunately the product moment covariance
and the corresponding Pearson correlation are very susceptible to outliers
(anomalies) in the data. Several robust measures of covariance have been
developed, but few are suitable for the ultrahigh dimensional data that are
becoming more prevalent nowadays. For that one needs methods whose computation
scales well with the dimension, that are guaranteed to yield a positive
semidefinite covariance matrix, and that are sufficiently robust to outliers as
well as sufficiently accurate in the statistical sense of low variability. We construct
such methods using data transformations. The resulting approach is simple, fast
and widely applicable. We study its robustness by deriving influence functions
and breakdown values, and computing the mean squared error on contaminated
data. Using these results we select a method that performs well overall. This
also allows us to construct a faster version of the DetectDeviatingCells method
(Rousseeuw and Van den Bossche, 2018) to detect cellwise outliers, that can
deal with much higher dimensions. The approach is illustrated on genomic data
with 12,000 variables and color video data with 920,000 dimensions.
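The idea of building a robust correlation from a data transformation can be sketched as follows. The abstract does not specify the transformation, so this sketch makes its own assumptions: each column is standardized by its median and MAD and then clipped to [-c, c], and the ordinary Pearson correlation of the transformed columns is returned. The function name `robust_correlation` and the cutoff `c` are hypothetical and are not the authors' method. Because the result is a product moment correlation of transformed data, it is positive semidefinite by construction.

```python
import numpy as np

def robust_correlation(X, c=2.0):
    """Pearson correlation of robustly standardized, clipped columns.

    Illustrative stand-in transformation (an assumption, not the paper's):
    median/MAD standardization followed by clipping to [-c, c].
    Assumes every column has a nonzero MAD.
    """
    X = np.asarray(X, dtype=float)
    med = np.median(X, axis=0)
    # The factor 1.4826 makes the MAD consistent with the standard
    # deviation for normally distributed data.
    mad = 1.4826 * np.median(np.abs(X - med), axis=0)
    Z = np.clip((X - med) / mad, -c, c)  # bound the influence of outliers
    # A correlation matrix of (transformed) data is always PSD.
    return np.corrcoef(Z, rowvar=False)
```

The cost is one pass of medians per column plus a single matrix product, which is why an approach of this kind scales to many variables.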
Intrinsic data depth for Hermitian positive definite matrices
Nondegenerate covariance, correlation and spectral density matrices are
necessarily symmetric or Hermitian and positive definite. The main contribution
of this paper is the development of statistical data depths for collections of
Hermitian positive definite matrices by exploiting the geometric structure of
the space as a Riemannian manifold. The depth functions allow one to naturally
characterize most central or outlying matrices, but also provide a practical
framework for inference in the context of samples of positive definite
matrices. First, the desired properties of an intrinsic data depth function
acting on the space of Hermitian positive definite matrices are presented.
Second, we propose two computationally fast pointwise and integrated data depth
functions that satisfy each of these requirements and investigate several
robustness and efficiency aspects. As an application, we construct depth-based
confidence regions for the intrinsic mean of a sample of positive definite
matrices, and apply them to the exploratory analysis of a collection of
covariance matrices associated with a multicenter research trial.
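An intrinsic depth on this space is built from a geodesic distance between matrices. The abstract does not name the metric, so the sketch below assumes the affine-invariant Riemannian metric, a common choice for Hermitian positive definite matrices, under which the distance between A and B is the Frobenius norm of log(A^{-1/2} B A^{-1/2}). The function name `riemannian_distance` is hypothetical, and this is only a building block, not one of the depth functions proposed in the paper.

```python
import numpy as np

def riemannian_distance(A, B):
    """Affine-invariant distance between Hermitian positive definite A and B.

    Computed from the eigenvalues of L^{-1} B L^{-H}, where A = L L^H is the
    Cholesky factorization; these equal the eigenvalues of A^{-1/2} B A^{-1/2}.
    """
    L = np.linalg.cholesky(A)
    L_inv = np.linalg.inv(L)
    M = L_inv @ B @ L_inv.conj().T   # Hermitian positive definite
    lam = np.linalg.eigvalsh(M)      # real, positive eigenvalues
    return float(np.sqrt(np.sum(np.log(lam) ** 2)))
```

Under this metric the distance is unchanged when both matrices are congruence-transformed by the same invertible matrix G, that is, d(G A G^H, G B G^H) = d(A, B).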
Essentially All Gaussian Two-Party Quantum States are a priori Nonclassical but Classically Correlated
Duan, Giedke, Cirac and Zoller (quant-ph/9908056) and, independently, Simon
(quant-ph/9909044) have recently found necessary and sufficient conditions for
the separability (classical correlation) of the Gaussian two-party (continuous
variable) states. Duan et al. remark that their criterion is based on a "much
stronger bound" on the total variance of a pair of Einstein-Podolsky-Rosen-type
operators than is required simply by the uncertainty relation. Here, we seek to
formalize and test this particular assertion in both classical and
quantum-theoretic frameworks. We first attach to these states the classical a
priori probability (Jeffreys' prior), proportional to the volume element of the
Fisher information metric on the Riemannian manifold of Gaussian (quadrivariate
normal) probability distributions. Then, numerical evidence indicates that more
than ninety-nine percent of the Gaussian two-party states do, in fact, meet the
more stringent criterion for separability. We collaterally note that the prior
probability assigned to the classical states, that is those having positive
Glauber-Sudarshan P-representations, is less than one-thousandth of one
percent. We then seek to attach to the Gaussian two-party states, as a measure,
the volume element of the associated (quantum-theoretic) Bures (minimal
monotone) metric. Our several extensive analyses then persistently yield
probabilities of separability and classicality that are, to very high orders of
accuracy, unity and zero, respectively, so the two apparently quite distinct
(classical and quantum-theoretic) forms of analysis are rather remarkably
consistent in their findings.
Comment: Seven pages, one table. Expanded introduction, additional references
include