3,491 research outputs found

On information plus noise kernel random matrices

Author: Karoui Noureddine El
Publication venue: 'Institute of Mathematical Statistics'
Publication date: 01/01/2009

Kernel random matrices have attracted a lot of interest in recent years, from both practical and theoretical standpoints. Most of the theoretical work so far has focused on the case where the data is sampled from a low-dimensional structure. Very recently, the first results concerning kernel random matrices with high-dimensional input data were obtained, in a setting where the data was sampled from a genuinely high-dimensional structure, similar to standard assumptions in random matrix theory. In this paper, we consider the case where the data is of the type "information $+$ noise." In other words, each observation is the sum of two independent elements: one sampled from a "low-dimensional" structure, the signal part of the data, the other being high-dimensional noise, normalized to not overwhelm but still affect the signal. We consider two types of noise, spherical and elliptical. In the spherical setting, we show that the spectral properties of kernel random matrices can be understood from a new kernel matrix, computed only from the signal part of the data, but using (in general) a slightly different kernel. The Gaussian kernel has some special properties in this setting. The elliptical setting, which is important from a robustness standpoint, is less prone to easy interpretation.

Comment: Published at http://dx.doi.org/10.1214/10-AOS801 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org)

arXiv.org e-Print Archive

CiteSeerX

Crossref

Tracy--Widom limit for the largest eigenvalue of a large class of complex sample covariance matrices

Author: Karoui Noureddine El
Publication venue: 'Institute of Mathematical Statistics'
Publication date: 01/01/2007

We consider the asymptotic fluctuation behavior of the largest eigenvalue of certain sample covariance matrices in the asymptotic regime where both dimensions of the corresponding data matrix go to infinity. More precisely, let $X$ be an $n\times p$ matrix, and let its rows be i.i.d. complex normal vectors with mean 0 and covariance $\Sigma_p$. We show that for a large class of covariance matrices $\Sigma_p$, the largest eigenvalue of $X^*X$ is asymptotically distributed (after recentering and rescaling) as the Tracy--Widom distribution that appears in the study of the Gaussian unitary ensemble. We give explicit formulas for the centering and scaling sequences that are easy to implement and involve only the spectral distribution of the population covariance, $n$ and $p$. The main theorem applies to a number of covariance models found in applications. For example, well-behaved Toeplitz matrices as well as covariance matrices whose spectral distribution is a sum of atoms (under some conditions on the mass of the atoms) are among the models the theorem can handle. Generalizations of the theorem to certain spiked versions of our models and a.s. results about the largest eigenvalue are given. We also discuss a simple corollary that does not require normality of the entries of the data matrix and some consequences for applications in multivariate statistics.

Comment: Published at http://dx.doi.org/10.1214/009117906000000917 in the Annals of Probability (http://www.imstat.org/aop/) by the Institute of Mathematical Statistics (http://www.imstat.org)

arXiv.org e-Print Archive

CiteSeerX

Crossref

Concentration of measure and spectra of random matrices: Applications to correlation matrices, elliptical distributions and beyond

Author: Karoui Noureddine El
Publication venue: 'Institute of Mathematical Statistics'
Publication date: 01/01/2009

We place ourselves in the setting of high-dimensional statistical inference, where the number of variables $p$ in a data set of interest is of the same order of magnitude as the number of observations $n$. More formally, we study the asymptotic properties of correlation and covariance matrices, in the setting where $p/n\to\rho\in(0,\infty)$, for general population covariance. We show that, for a large class of models studied in random matrix theory, spectral properties of large-dimensional correlation matrices are similar to those of large-dimensional covariance matrices. We also derive a Marčenko--Pastur-type system of equations for the limiting spectral distribution of covariance matrices computed from data with elliptical distributions and generalizations of this family. The motivation for this study comes partly from the possible relevance of such distributional assumptions to problems in econometrics and portfolio optimization, as well as robustness questions for certain classical random matrix results. A mathematical theme of the paper is the important use we make of concentration inequalities.

Comment: Published at http://dx.doi.org/10.1214/08-AAP548 in the Annals of Applied Probability (http://www.imstat.org/aap/) by the Institute of Mathematical Statistics (http://www.imstat.org)
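The claim that correlation and covariance matrices have similar spectra when $p/n\to\rho$ can be checked numerically in a simple special case. The sketch below is only an illustration and is not taken from the paper: it assumes i.i.d. standard Gaussian data (identity population covariance), and the function name `covariance_and_correlation_spectra` and the sizes `n`, `p` are hypothetical choices. For that special case the top of both spectra should sit near the classical Marčenko--Pastur edge $(1+\sqrt{p/n})^2$.

```python
import numpy as np

def covariance_and_correlation_spectra(n, p, seed=0):
    """Eigenvalues of the sample covariance and sample correlation matrices.

    ASSUMPTION (illustration only): the n rows are i.i.d. N(0, I_p) vectors,
    i.e. the population covariance is the identity.
    """
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((n, p))      # n observations, p variables
    S = np.cov(X, rowvar=False)          # p x p sample covariance matrix
    R = np.corrcoef(X, rowvar=False)     # p x p sample correlation matrix
    # eigvalsh returns eigenvalues in ascending order for symmetric matrices.
    return np.linalg.eigvalsh(S), np.linalg.eigvalsh(R)

n, p = 400, 200                          # p / n = 0.5
cov_eigs, corr_eigs = covariance_and_correlation_spectra(n, p)

# Right edge of the Marchenko-Pastur law for identity population covariance.
mp_edge = (1.0 + np.sqrt(p / n)) ** 2

print("largest covariance eigenvalue: ", cov_eigs[-1])
print("largest correlation eigenvalue:", corr_eigs[-1])
print("Marchenko-Pastur right edge:   ", mp_edge)
```

Plotting histograms of `cov_eigs` and `corr_eigs` side by side gives a visual version of the same comparison. The elliptical models discussed in the abstract would need a different data generator.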

arXiv.org e-Print Archive

CiteSeerX

Crossref

The spectrum of kernel random matrices

Author: Karoui Noureddine El
Publication venue: 'Institute of Mathematical Statistics'
Publication date: 04/01/2010

We place ourselves in the setting of high-dimensional statistical inference where the number of variables $p$ in a dataset of interest is of the same order of magnitude as the number of observations $n$. We consider the spectrum of certain kernel random matrices, in particular $n\times n$ matrices whose $(i,j)$th entry is $f(X_i'X_j/p)$ or $f(\Vert X_i-X_j\Vert^2/p)$, where $p$ is the dimension of the data, and $X_i$ are independent data vectors. Here $f$ is assumed to be a locally smooth function. The study is motivated by questions arising in statistics and computer science where these matrices are used to perform, among other things, nonlinear versions of principal component analysis. Surprisingly, we show that in high dimensions, and for the models we analyze, the problem becomes essentially linear, which is at odds with heuristics sometimes used to justify the usage of these methods. The analysis also highlights certain peculiarities of models widely studied in random matrix theory and raises some questions about their relevance as tools to model high-dimensional data encountered in practice.

Comment: Published at http://dx.doi.org/10.1214/08-AOS648 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org)
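To make the two matrix families in this abstract concrete, here is a minimal NumPy sketch that builds an inner-product kernel matrix and a distance kernel matrix and computes their eigenvalues. The helper names (`inner_product_kernel`, `distance_kernel`), the Gaussian data model, the sizes, and the two choices of $f$ are assumptions made for illustration; none of them come from the paper.

```python
import numpy as np

def inner_product_kernel(X, f):
    """n x n matrix whose (i, j) entry is f(X_i' X_j / p); rows of X are the X_i."""
    p = X.shape[1]
    return f(X @ X.T / p)

def distance_kernel(X, f):
    """n x n matrix whose (i, j) entry is f(||X_i - X_j||^2 / p)."""
    p = X.shape[1]
    sq_norms = np.sum(X ** 2, axis=1)
    # ||a - b||^2 = ||a||^2 + ||b||^2 - 2 a'b; clip tiny negatives from round-off.
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * (X @ X.T)
    sq_dists = np.maximum(sq_dists, 0.0)
    return f(sq_dists / p)

# ASSUMPTION (illustration only): i.i.d. standard Gaussian data vectors.
rng = np.random.default_rng(0)
n, p = 200, 300
X = rng.standard_normal((n, p))

K_inner = inner_product_kernel(X, np.exp)               # f(t) = exp(t)
K_dist = distance_kernel(X, lambda t: np.exp(-t))       # f(t) = exp(-t)

# Both matrices are symmetric, so eigvalsh applies (ascending eigenvalues).
eigs_inner = np.linalg.eigvalsh(K_inner)
eigs_dist = np.linalg.eigvalsh(K_dist)
print("top eigenvalue, inner-product kernel:", eigs_inner[-1])
print("top eigenvalue, distance kernel:     ", eigs_dist[-1])
```

Any elementwise function can be passed as `f`. The distance kernel is computed from the Gram matrix rather than from explicit pairwise differences, which keeps memory at order $n^2$.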

arXiv.org e-Print Archive

Crossref

A rate of convergence result for the largest eigenvalue of complex white Wishart matrices

Author: Karoui Noureddine El
Publication venue: 'Institute of Mathematical Statistics'
Publication date: 01/01/2007

It has been recently shown that if $X$ is an $n\times N$ matrix whose entries are i.i.d. standard complex Gaussian and $l_1$ is the largest eigenvalue of $X^*X$, there exist sequences $m_{n,N}$ and $s_{n,N}$ such that $(l_1-m_{n,N})/s_{n,N}$ converges in distribution to $W_2$, the Tracy--Widom law appearing in the study of the Gaussian unitary ensemble. This probability law has a density which is known and computable. The cumulative distribution function of $W_2$ is denoted $F_2$. In this paper we show that, under the assumption that $n/N\to \gamma\in(0,\infty)$, we can find a function $M$, continuous and nonincreasing, and sequences $\tilde{\mu}_{n,N}$ and $\tilde{\sigma}_{n,N}$ such that, for all real $s_0$, there exists an integer $N(s_0,\gamma)$ for which, if $(n\wedge N)\geq N(s_0,\gamma)$, we have, with $l_{n,N}=(l_1-\tilde{\mu}_{n,N})/\tilde{\sigma}_{n,N}$,

$$\forall s\geq s_0\qquad (n\wedge N)^{2/3}|P(l_{n,N}\leq s)-F_2(s)|\leq M(s_0)\exp(-s).$$

The surprisingly good 2/3 rate and qualitative properties of the bounding function help explain the fact that the limiting distribution $W_2$ is a good approximation to the empirical distribution of $l_{n,N}$ in simulations, an important fact from the point of view of (e.g., statistical) applications.

Comment: Published at http://dx.doi.org/10.1214/009117906000000502 in the Annals of Probability (http://www.imstat.org/aop/) by the Institute of Mathematical Statistics (http://www.imstat.org)
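The kind of simulation mentioned at the end of this abstract can be sketched in a few lines. The code below draws complex white Wishart matrices, extracts the largest eigenvalue of $X^*X$, then recenters and rescales it. The abstract does not state $\tilde{\mu}_{n,N}$ and $\tilde{\sigma}_{n,N}$, so the sketch assumes the classical leading-order constants $(\sqrt{n}+\sqrt{N})^2$ and $(\sqrt{n}+\sqrt{N})(1/\sqrt{n}+1/\sqrt{N})^{1/3}$ instead. These are not the refined sequences of the paper, and all function names and sizes are hypothetical.

```python
import numpy as np

def largest_eigenvalue_complex_wishart(n, N, rng):
    """Largest eigenvalue of X^* X, X an n x N matrix of i.i.d. standard complex Gaussians."""
    # Standard complex Gaussian: independent N(0, 1/2) real and imaginary parts.
    X = (rng.standard_normal((n, N)) + 1j * rng.standard_normal((n, N))) / np.sqrt(2.0)
    # The largest eigenvalue of X^* X is the squared largest singular value of X.
    s_max = np.linalg.svd(X, compute_uv=False)[0]
    return s_max ** 2

def centering_and_scaling(n, N):
    """ASSUMPTION: classical leading-order constants, not the paper's refined ones."""
    root_sum = np.sqrt(n) + np.sqrt(N)
    mu = root_sum ** 2
    sigma = root_sum * (1.0 / np.sqrt(n) + 1.0 / np.sqrt(N)) ** (1.0 / 3.0)
    return mu, sigma

def simulate_normalized_l1(n, N, reps, seed=0):
    """Monte Carlo draws of (l_1 - mu) / sigma."""
    rng = np.random.default_rng(seed)
    mu, sigma = centering_and_scaling(n, N)
    l1 = np.array([largest_eigenvalue_complex_wishart(n, N, rng) for _ in range(reps)])
    return (l1 - mu) / sigma

samples = simulate_normalized_l1(n=50, N=100, reps=200, seed=0)
print("sample mean:", samples.mean())
print("sample std: ", samples.std())
```

For reference, the Tracy--Widom law $W_2$ has mean about $-1.77$ and standard deviation about $0.90$, so the sample moments can be compared against those values. Comparing the empirical distribution function with $F_2$ itself would require a numerical implementation of $F_2$, which is left out here.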

arXiv.org e-Print Archive

CiteSeerX

Crossref

Uniform approximation and explicit estimates for the prolate spheroidal wave functions

Author: Bonami Aline
Karoui Abderrazek
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 30/10/2014

For fixed $c$, Prolate Spheroidal Wave Functions (PSWFs), denoted by $\psi_{n,c}$, form an orthogonal basis with remarkable properties for the space of band-limited functions with bandwidth $c$. They have been widely studied and used since the seminal work of D. Slepian and his co-authors. In several applications, uniform estimates of the $\psi_{n,c}$ in $n$ and $c$ are needed. To progress in this direction, we push forward the uniform approximation error bounds and give an explicit approximation of their values at $1$ in terms of the Legendre complete elliptic integral of the first kind. Also, we give an explicit formula for the accurate approximation of the eigenvalues of the Sturm-Liouville operator associated with the PSWFs.

arXiv.org e-Print Archive

HAL Descartes