Statistical thresholds for Tensor PCA
We study the statistical limits of testing and estimation for a rank one
deformation of a Gaussian random tensor. We compute the sharp thresholds for
hypothesis testing and estimation by maximum likelihood and show that they are
the same. Furthermore, we find that the maximum likelihood estimator achieves
the maximal correlation with the planted vector among measurable estimators
above the estimation threshold. In this setting, the maximum likelihood
estimator exhibits a discontinuous BBP-type transition: below the critical
threshold the estimator is orthogonal to the planted vector, but above the
critical threshold, it achieves a positive correlation that is uniformly bounded
away from zero.
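As a toy illustration of the phenomenon (not the paper's construction), the sketch below plants a rank-one spike in a Gaussian 3-tensor and runs tensor power iteration, a standard heuristic surrogate for the maximum likelihood estimator. The normalization, the warm start, and all names are our illustrative assumptions.

```python
import numpy as np

def spiked_tensor(n, beta, rng):
    """Rank-one deformation of a Gaussian 3-tensor:
    T = beta * v (x) v (x) v + W / sqrt(n), with W i.i.d. standard normal.
    (This normalization is a common convention; the paper's may differ.)"""
    v = rng.standard_normal(n)
    v /= np.linalg.norm(v)
    W = rng.standard_normal((n, n, n))
    return beta * np.einsum("i,j,k->ijk", v, v, v) + W / np.sqrt(n), v

def tensor_power_iteration(T, u0, iters=100):
    """Heuristic surrogate for the MLE: iterate u <- T(u, u, .) / ||.||.
    Warm-started near the planted vector, it tracks the local maximizer
    whose overlap with v is positive above the estimation threshold."""
    u = u0 / np.linalg.norm(u0)
    for _ in range(iters):
        u = np.einsum("ijk,j,k->i", T, u, u)
        u /= np.linalg.norm(u)
    return u

rng = np.random.default_rng(0)
n, beta = 40, 5.0                           # signal strength well above threshold
T, v = spiked_tensor(n, beta, rng)
u = tensor_power_iteration(T, u0=v.copy())  # warm start at the planted vector
overlap = abs(u @ v)                        # large in the strong-signal regime
```

Below the algorithmic threshold a random initialization would instead remain essentially orthogonal to the planted vector, mirroring the discontinuous BBP-type transition the abstract describes.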
Cleaning large correlation matrices: tools from random matrix theory
This review covers recent results concerning the estimation of large
covariance matrices using tools from Random Matrix Theory (RMT). We introduce
several RMT methods and analytical techniques, such as the Replica formalism
and Free Probability, with an emphasis on the Marchenko-Pastur equation that
provides information on the resolvent of multiplicatively corrupted noisy
matrices. Special care is devoted to the statistics of the eigenvectors of the
empirical correlation matrix, which turn out to be crucial for many
applications. We show in particular how these results can be used to build
consistent "Rotationally Invariant" estimators (RIE) for large correlation
matrices when there is no prior on the structure of the underlying process. The
last part of this review is dedicated to some real-world applications within
financial markets as a case in point. We establish empirically the efficacy of
the RIE framework, which is found to be superior in this case to all previously
proposed methods. The case of additively (rather than multiplicatively)
corrupted noisy matrices is also dealt with in a special Appendix. Several open
problems and interesting technical developments are discussed throughout the
paper.
Comment: 165 pages; article submitted to Physics Reports.
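A minimal sketch of the simplest cleaning scheme in this family, eigenvalue clipping at the Marchenko-Pastur edge, a cruder relative of the rotationally invariant estimators (RIE) the review develops. The function name and the toy identity-correlation setup are ours, not the review's.

```python
import numpy as np

def clip_eigenvalues(E, q):
    """Clean an empirical correlation matrix E by treating all eigenvalues
    below the Marchenko-Pastur upper edge (1 + sqrt(q))^2, q = p/T, as pure
    noise: replace them by their average, then renormalize to unit diagonal."""
    lam, V = np.linalg.eigh(E)
    mp_edge = (1.0 + np.sqrt(q)) ** 2          # MP upper edge for pure noise
    noise = lam < mp_edge
    if noise.any():
        lam[noise] = lam[noise].mean()         # flatten the noise bulk
    C = (V * lam) @ V.T                        # V diag(lam) V^T
    d = np.sqrt(np.diag(C))
    return C / np.outer(d, d)                  # restore unit diagonal

rng = np.random.default_rng(1)
p, T = 50, 200                                 # q = p/T = 0.25
X = rng.standard_normal((T, p))                # true correlation: identity
E = (X.T @ X) / T                              # noisy empirical correlation
C = clip_eigenvalues(E, q=p / T)
err_raw = np.linalg.norm(E - np.eye(p))        # error before cleaning
err_clean = np.linalg.norm(C - np.eye(p))      # error after cleaning
```

On this pure-noise example the cleaned matrix is much closer to the true (identity) correlation than the raw empirical one; a full RIE shrinks each eigenvalue individually rather than flattening the bulk.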
Extreme eigenvalues of large-dimensional spiked Fisher matrices with application
Consider two p-variate populations, not necessarily Gaussian, with covariance matrices Σ1 and Σ2, respectively. Let S1 and S2 be the corresponding sample covariance matrices with degrees of freedom m and n. When the difference Δ between Σ1 and Σ2 is of small rank compared to p, m and n, the Fisher matrix S := S2^{-1} S1 is called a spiked Fisher matrix. When p, m and n grow to infinity proportionally, we establish a phase transition for the extreme eigenvalues of the Fisher matrix: a displacement formula showing that when the eigenvalues of Δ (spikes) are above (or under) a critical value, the associated extreme eigenvalues of S will converge to some point outside the support of the limiting spectral distribution (LSD) of the other eigenvalues (i.e., become outliers); otherwise, they will converge to the edge points of the LSD. Furthermore, we derive central limit theorems for these outlier eigenvalues of S. The limiting distributions are found to be Gaussian if and only if the corresponding population spike eigenvalues in Δ are simple. Two applications are introduced. The first uses the largest eigenvalue of the Fisher matrix to test the equality of two high-dimensional covariance matrices, and an explicit power function is found under the spiked alternative. The second lies in the field of signal detection, where an estimator for the number of signals is proposed while the covariance structure of the noise is arbitrary.
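The outlier phenomenon is easy to see numerically. The simulation sketch below (our own setup: a rank-one Δ placed on the first coordinate, Σ2 = I, and the normalizations shown in the comments) compares the top eigenvalue of the Fisher matrix with and without a strong spike.

```python
import numpy as np

def top_fisher_eig(spike, p, m, n, rng):
    """Largest eigenvalue of the Fisher matrix S = S2^{-1} S1 when
    Sigma1 = I + spike * e1 e1^T and Sigma2 = I (rank-one Delta)."""
    Sigma1 = np.eye(p)
    Sigma1[0, 0] = 1.0 + spike
    X = rng.standard_normal((m, p)) @ np.linalg.cholesky(Sigma1).T
    Y = rng.standard_normal((n, p))
    S1 = X.T @ X / m                       # sample covariance, m degrees of freedom
    S2 = Y.T @ Y / n                       # sample covariance, n degrees of freedom
    # Eigenvalues of S2^{-1} S1 are real and positive; discard numerical
    # imaginary parts from the non-symmetric solve.
    return np.linalg.eigvals(np.linalg.solve(S2, S1)).real.max()

rng = np.random.default_rng(2)
p, m, n = 60, 300, 300                     # proportional regime, p/m = p/n = 0.2
lam_null = top_fisher_eig(0.0, p, m, n, rng)   # no spike: stuck at the bulk edge
lam_spiked = top_fisher_eig(8.0, p, m, n, rng) # strong spike: detached outlier
```

With no spike the top eigenvalue sits at the right edge of the LSD support; the strong spike pushes it well outside, which is exactly what the equality test in the first application exploits.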