On information plus noise kernel random matrices
Kernel random matrices have attracted a lot of interest in recent years, from
both practical and theoretical standpoints. Most of the theoretical work so far
has focused on the case where the data is sampled from a low-dimensional
structure. Very recently, the first results concerning kernel random matrices
with high-dimensional input data were obtained, in a setting where the data was
sampled from a genuinely high-dimensional structure---similar to standard
assumptions in random matrix theory. In this paper, we consider the case where
the data is of the type "information + noise." In other words, each
observation is the sum of two independent elements: one sampled from a
"low-dimensional" structure, the signal part of the data, the other being
high-dimensional noise, normalized to not overwhelm but still affect the
signal. We consider two types of noise, spherical and elliptical. In the
spherical setting, we show that the spectral properties of kernel random
matrices can be understood from a new kernel matrix, computed only from the
signal part of the data, but using (in general) a slightly different kernel.
The Gaussian kernel has some special properties in this setting. The elliptical
setting, which is important from a robustness standpoint, is less prone to easy
interpretation.
Comment: Published at http://dx.doi.org/10.1214/10-AOS801 in the Annals of
Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical
Statistics (http://www.imstat.org)
Tracy--Widom limit for the largest eigenvalue of a large class of complex sample covariance matrices
We consider the asymptotic fluctuation behavior of the largest eigenvalue of
certain sample covariance matrices in the asymptotic regime where both
dimensions of the corresponding data matrix go to infinity. More precisely, let
$X$ be an $n\times p$ matrix, and let its rows be i.i.d. complex normal vectors
with mean 0 and covariance $\Sigma_p$. We show that for a large class of
covariance matrices $\Sigma_p$, the largest eigenvalue of $X^*X$ is
asymptotically distributed (after recentering and rescaling) as the
Tracy--Widom distribution that appears in the study of the Gaussian unitary
ensemble. We give explicit formulas for the centering and scaling sequences
that are easy to implement and involve only the spectral distribution of the
population covariance, $n$ and $p$. The main theorem applies to a number of
covariance models found in applications. For example, well-behaved Toeplitz
matrices as well as covariance matrices whose spectral distribution is a sum of
atoms (under some conditions on the mass of the atoms) are among the models the
theorem can handle. Generalizations of the theorem to certain spiked versions
of our models and a.s. results about the largest eigenvalue are given. We also
discuss a simple corollary that does not require normality of the entries of
the data matrix and some consequences for applications in multivariate
statistics.
Comment: Published at http://dx.doi.org/10.1214/009117906000000917 in the
Annals of Probability (http://www.imstat.org/aop/) by the Institute of
Mathematical Statistics (http://www.imstat.org)
Concentration of measure and spectra of random matrices: Applications to correlation matrices, elliptical distributions and beyond
We place ourselves in the setting of high-dimensional statistical inference,
where the number of variables $p$ in a data set of interest is of the same
order of magnitude as the number of observations $n$. More formally, we study
the asymptotic properties of correlation and covariance matrices, in the
setting where $p/n\to\rho\in(0,\infty)$, for general population covariance. We
show that, for a large class of models studied in random matrix theory,
spectral properties of large-dimensional correlation matrices are similar to
those of large-dimensional covariance matrices. We also derive a
Marčenko--Pastur-type system of equations for the limiting spectral
distribution of covariance matrices computed from data with elliptical
distributions and generalizations of this family. The motivation for this study
comes partly from the possible relevance of such distributional assumptions to
problems in econometrics and portfolio optimization, as well as robustness
questions for certain classical random matrix results. A mathematical theme of
the paper is the important use we make of concentration inequalities.
Comment: Published at http://dx.doi.org/10.1214/08-AAP548 in the Annals of
Applied Probability (http://www.imstat.org/aap/) by the Institute of
Mathematical Statistics (http://www.imstat.org)
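The comparison between covariance and correlation spectra described in this
abstract can be explored numerically. The following is only a minimal sketch
under assumptions that are not taken from the paper: the data are i.i.d.
standard Gaussian (identity population covariance), and the function names
covariance_and_correlation_spectra and marchenko_pastur_edges are hypothetical
helpers introduced for illustration.

```python
import numpy as np


def covariance_and_correlation_spectra(n, p, seed=0):
    """Eigenvalues of the sample covariance and sample correlation matrices
    for n observations of p i.i.d. N(0, 1) variables (illustrative model)."""
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((n, p))
    Xc = X - X.mean(axis=0)          # center each variable (column)
    S = Xc.T @ Xc / (n - 1)          # p x p sample covariance matrix
    d = np.sqrt(np.diag(S))          # sample standard deviations
    R = S / np.outer(d, d)           # p x p sample correlation matrix
    return np.linalg.eigvalsh(S), np.linalg.eigvalsh(R)


def marchenko_pastur_edges(rho):
    """Support edges (1 - sqrt(rho))^2 and (1 + sqrt(rho))^2 of the
    Marchenko-Pastur law for identity covariance and ratio rho = p/n <= 1."""
    root = np.sqrt(rho)
    return (1.0 - root) ** 2, (1.0 + root) ** 2


if __name__ == "__main__":
    n, p = 1000, 250
    ev_cov, ev_cor = covariance_and_correlation_spectra(n, p)
    lower, upper = marchenko_pastur_edges(p / n)
    print("Marchenko-Pastur edges:", lower, upper)
    print("covariance  extremes:", ev_cov.min(), ev_cov.max())
    print("correlation extremes:", ev_cor.min(), ev_cor.max())
```

In this identity-covariance setting the extreme eigenvalues of both matrices
should fall close to the same edges; other population covariances or
elliptical data would require the more general equations the abstract refers to.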
The spectrum of kernel random matrices
We place ourselves in the setting of high-dimensional statistical inference
where the number of variables $p$ in a dataset of interest is of the same order
of magnitude as the number of observations $n$. We consider the spectrum of
certain kernel random matrices, in particular $n\times n$ matrices whose
$(i,j)$th entry is $f(X_i'X_j/p)$ or $f(\Vert X_i-X_j\Vert^2/p)$, where $p$ is
the dimension of the data, and the $X_i$ are independent data vectors. Here $f$
is assumed to be a locally smooth function. The study is motivated by questions
arising in statistics and computer science where these matrices are used to
perform, among other things, nonlinear versions of principal component
analysis. Surprisingly, we show that in high dimensions, and for the models we
analyze, the problem becomes essentially linear--which is at odds with
heuristics sometimes used to justify the usage of these methods. The analysis
also highlights certain peculiarities of models widely studied in random matrix
theory and raises some questions about their relevance as tools to model
high-dimensional data encountered in practice.
Comment: Published at http://dx.doi.org/10.1214/08-AOS648 in the Annals of
Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical
Statistics (http://www.imstat.org)
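The two kinds of kernel random matrices named in this abstract, one built from
inner products and one from squared distances, each scaled by the dimension,
can be formed directly. This is a minimal sketch, not the paper's code: the
Gaussian data model, the choice of exp as the smooth function, and the helper
names inner_product_kernel and distance_kernel are all assumptions made for
illustration.

```python
import numpy as np


def inner_product_kernel(X, f):
    """n x n matrix whose (i, j) entry is f(<X_i, X_j> / p).
    The rows of X are the data vectors; p is the number of columns."""
    p = X.shape[1]
    return f(X @ X.T / p)


def distance_kernel(X, f):
    """n x n matrix whose (i, j) entry is f(||X_i - X_j||^2 / p)."""
    p = X.shape[1]
    sq_norms = np.sum(X ** 2, axis=1)
    # ||a - b||^2 = ||a||^2 + ||b||^2 - 2 <a, b>
    d2 = sq_norms[:, None] + sq_norms[None, :] - 2.0 * (X @ X.T)
    np.maximum(d2, 0.0, out=d2)      # clip tiny negatives caused by round-off
    return f(d2 / p)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n, p = 200, 400                  # n and p of the same order of magnitude
    X = rng.standard_normal((n, p))  # illustrative data model (assumption)
    K_inner = inner_product_kernel(X, np.exp)
    K_dist = distance_kernel(X, lambda t: np.exp(-t))
    print("largest eigenvalues (inner product):",
          np.linalg.eigvalsh(K_inner)[-3:])
    print("largest eigenvalues (distance):",
          np.linalg.eigvalsh(K_dist)[-3:])
```

Comparing these spectra with those of a linear function of the Gram matrix
$XX'/p$ is one way to probe the "essentially linear" behavior the abstract
describes; the sketch stops at computing the eigenvalues.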
A rate of convergence result for the largest eigenvalue of complex white Wishart matrices
It has been recently shown that if $X$ is an $n\times N$ matrix whose entries
are i.i.d. standard complex Gaussian and $l_1$ is the largest eigenvalue of
$X^*X$, there exist sequences $m_{n,N}$ and $s_{n,N}$ such that
$(l_1-m_{n,N})/s_{n,N}$ converges in distribution to $W_2$, the Tracy--Widom
law appearing in the study of the Gaussian unitary ensemble. This probability
law has a density which is known and computable. The cumulative distribution
function of $W_2$ is denoted $F_2$. In this paper we show that, under the
assumption that $n/N\to\gamma\in(0,\infty)$, we can find a function $M$,
continuous and nonincreasing, and sequences $\tilde{\mu}_{n,N}$ and
$\tilde{\sigma}_{n,N}$ such that, for all real $s_0$, there exists an integer
$N(s_0,\gamma)$ for which, if $(n\wedge N)\geq N(s_0,\gamma)$, we have, with
$l_{n,N}=(l_1-\tilde{\mu}_{n,N})/\tilde{\sigma}_{n,N}$,
\[\forall s\geq s_0\qquad (n\wedge N)^{2/3}|P(l_{n,N}\leq s)-F_2(s)|\leq M(s_0)\exp(-s).\]
The surprisingly good 2/3 rate and qualitative properties of the bounding function
help explain the fact that the limiting distribution $W_2$ is a good
approximation to the empirical distribution of $l_{n,N}$ in simulations, an
important fact from the point of view of (e.g., statistical) applications.
Comment: Published at http://dx.doi.org/10.1214/009117906000000502 in the
Annals of Probability (http://www.imstat.org/aop/) by the Institute of
Mathematical Statistics (http://www.imstat.org)
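The kind of simulation this abstract alludes to can be sketched as follows:
draw a matrix with i.i.d. standard complex Gaussian entries, take the largest
eigenvalue of $X^*X$, then center and scale it. The abstract does not state
the paper's refined sequences, so this sketch assumes the classical choices
$(\sqrt{n}+\sqrt{N})^2$ for centering and
$(\sqrt{n}+\sqrt{N})(1/\sqrt{n}+1/\sqrt{N})^{1/3}$ for scaling; the function
names are hypothetical.

```python
import numpy as np


def centering_and_scaling(n, N):
    """Classical centering (sqrt(n) + sqrt(N))^2 and scaling
    (sqrt(n) + sqrt(N)) * (1/sqrt(n) + 1/sqrt(N))^(1/3).
    Assumption: these are not the refined sequences of the paper."""
    a, b = np.sqrt(n), np.sqrt(N)
    mu = (a + b) ** 2
    sigma = (a + b) * (1.0 / a + 1.0 / b) ** (1.0 / 3.0)
    return mu, sigma


def rescaled_largest_eigenvalue(n, N, rng):
    """Draw an n x N matrix X with i.i.d. standard complex Gaussian entries
    and return (l_1 - mu) / sigma, with l_1 the largest eigenvalue of X^* X."""
    real = rng.standard_normal((n, N))
    imag = rng.standard_normal((n, N))
    X = (real + 1j * imag) / np.sqrt(2.0)   # E|X_ij|^2 = 1
    # The largest eigenvalue of X^* X is the squared largest singular value.
    l1 = np.linalg.norm(X, 2) ** 2
    mu, sigma = centering_and_scaling(n, N)
    return (l1 - mu) / sigma


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    draws = np.array([rescaled_largest_eigenvalue(100, 200, rng)
                      for _ in range(500)])
    print("sample mean:", draws.mean(), "sample variance:", draws.var())
```

The empirical distribution of the draws can then be compared with a numerical
evaluation of $F_2$, which this sketch does not attempt to compute.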
Uniform approximation and explicit estimates for the prolate spheroidal wave functions
For fixed $c$, Prolate Spheroidal Wave Functions (PSWFs), denoted by
$\psi_{n,c}$, form an orthogonal basis with remarkable properties for the
space of band-limited functions with bandwidth $c$. They have been widely
studied and used since the seminal work of D. Slepian and his co-authors. In
several applications, uniform estimates of the $\psi_{n,c}$ in $n$ and $c$ are
needed. To progress in this direction, we push forward the uniform
approximation error bounds and give an explicit approximation of their values
at $1$ in terms of the
Legendre complete elliptic integral of the first kind. Also, we give an
explicit formula for the accurate approximation of the eigenvalues of the
Sturm-Liouville operator associated with the PSWFs.