102 research outputs found

### How many entries of a typical orthogonal matrix can be approximated by independent normals?

We solve an open problem of Diaconis that asks for the largest orders of
$p_n$ and $q_n$ such that $Z_n$, the $p_n\times q_n$ upper left block of a
random matrix $\boldsymbol{\Gamma}_n$ which is uniformly distributed on the
orthogonal group $O(n)$, can be approximated by independent standard normals.
This problem is solved by two different approximation methods. First, we show
that the variation distance between the joint distribution of entries of $Z_n$
and that of $p_nq_n$ independent standard normals goes to zero provided
$p_n=o(\sqrt{n})$ and $q_n=o(\sqrt{n})$. We also show that the above variation
distance does not go to zero if $p_n=[x\sqrt{n} ]$ and $q_n=[y\sqrt{n} ]$ for
any positive numbers $x$ and $y$. This says that the largest orders of $p_n$
and $q_n$ are $o(n^{1/2})$ in the sense of the above approximation. Second,
suppose $\boldsymbol{\Gamma}_n=(\gamma_{ij})_{n\times n}$ is generated by
performing the Gram--Schmidt algorithm on the columns of
$\boldsymbol{Y}_n=(y_{ij})_{n\times n}$, where $\{y_{ij};1\leq i,j\leq n\}$ are
i.i.d. standard normals. We show that $\epsilon_n(m):=\max_{1\leq i\leq n,1\leq
j\leq m}|\sqrt{n}\cdot\gamma_{ij}-y_{ij}|$ goes to zero in probability as long
as $m=m_n=o(n/\log n)$. We also prove that $\epsilon_n(m_n)\to 2\sqrt{\alpha}$
in probability when $m_n=[n\alpha/\log n]$ for any $\alpha>0.$ This says that
$m_n=o(n/\log n)$ is the largest order such that the entries of the first $m_n$
columns of $\boldsymbol{\Gamma}_n$ can be approximated simultaneously by
independent standard normals.

Comment: Published at http://dx.doi.org/10.1214/009117906000000205 in the
Annals of Probability (http://www.imstat.org/aop/) by the Institute of
Mathematical Statistics (http://www.imstat.org)
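
The second approximation in this abstract can be explored numerically. The following is a minimal sketch, not the authors' code: it draws the first $m$ columns of $\boldsymbol{Y}_n$, orthonormalizes them by Gram--Schmidt, and evaluates $\epsilon_n(m)$. The function names `gram_schmidt` and `epsilon`, the seed handling, and the choice to generate only the first $m$ columns (which is all that the first $m$ columns of $\boldsymbol{\Gamma}_n$ depend on) are assumptions made for illustration.

```python
import math
import random


def gram_schmidt(columns):
    """Orthonormalize a list of column vectors (modified Gram-Schmidt)."""
    basis = []
    for v in columns:
        w = list(v)
        for u in basis:
            # Remove the component of w along the already-built unit vector u.
            proj = sum(a * b for a, b in zip(u, w))
            w = [b - proj * a for a, b in zip(u, w)]
        norm = math.sqrt(sum(b * b for b in w))
        basis.append([b / norm for b in w])
    return basis


def epsilon(n, m, seed=0):
    """Return max over i <= n, j <= m of |sqrt(n) * gamma_ij - y_ij|."""
    rng = random.Random(seed)
    # Columns are drawn one after another, so a fixed seed reproduces the
    # same leading columns for every m.
    y_cols = [[rng.gauss(0.0, 1.0) for _ in range(n)] for _ in range(m)]
    gamma_cols = gram_schmidt(y_cols)
    root_n = math.sqrt(n)
    return max(
        abs(root_n * g - y)
        for g_col, y_col in zip(gamma_cols, y_cols)
        for g, y in zip(g_col, y_col)
    )


if __name__ == "__main__":
    # Hypothetical sizes; m is held fixed while n grows.
    for n in (100, 400, 1600):
        print(n, round(epsilon(n, m=5), 3))
```

With $m$ held fixed, the printed values should tend to shrink as $n$ grows, in line with the statement that $\epsilon_n(m)\to 0$ in probability when $m=o(n/\log n)$. A single run is only one random draw, so individual values will fluctuate.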

### The asymptotic distributions of the largest entries of sample correlation matrices

Let $X_n=(x_{ij})$ be an $n$ by $p$ data matrix, where the $n$ rows form a random
sample of size $n$ from a certain $p$-dimensional population distribution.
Let $R_n=(\rho_{ij})$ be the $p\times p$ sample correlation matrix of $X_n$; that
is, the entry $\rho_{ij}$ is the usual Pearson's correlation coefficient between
the $i$th column of $X_n$ and the $j$th column of $X_n$. For contemporary data both
$n$ and $p$ are large. When the population is multivariate normal, we study the test of
$H_0$: the $p$ variates of the population are uncorrelated.
A test statistic is chosen as $L_n=\max_{i\ne j}|\rho_{ij}|$. The asymptotic
distribution of $L_n$ is derived by using the Chen-Stein Poisson approximation
method. Similar results for the non-Gaussian case are also derived.
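
The statistic $L_n$ is straightforward to compute from a data matrix. The sketch below is illustrative only: the helper names `pearson` and `max_abs_correlation` and the simulated Gaussian data are assumptions, and the code evaluates the statistic without implementing the limiting distribution derived in the paper.

```python
import math
import random


def pearson(u, v):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    cov = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    var_u = sum((a - mean_u) ** 2 for a in u)
    var_v = sum((b - mean_v) ** 2 for b in v)
    return cov / math.sqrt(var_u * var_v)


def max_abs_correlation(x):
    """L_n = max over i != j of |rho_ij| for a data matrix x given as n rows."""
    p = len(x[0])
    cols = [[row[j] for row in x] for j in range(p)]
    return max(
        abs(pearson(cols[i], cols[j]))
        for i in range(p)
        for j in range(i + 1, p)
    )


if __name__ == "__main__":
    # Hypothetical sizes: n = 200 observations of p = 30 independent normals,
    # i.e. data generated under H_0.
    rng = random.Random(0)
    data = [[rng.gauss(0.0, 1.0) for _ in range(30)] for _ in range(200)]
    print(max_abs_correlation(data))
```

Because $\rho_{ij}=\rho_{ji}$, the maximum is taken over pairs $i<j$ only, which halves the work without changing the value of $L_n$.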

### Approximation of Rectangular Beta-Laguerre Ensembles and Large Deviations

We investigate the random eigenvalues coming from the beta-Laguerre ensemble
with parameter p, which is a generalization of the real, complex and quaternion
Wishart matrices of parameter (n,p). In the case that the sample size n is much
smaller than the dimension of the population distribution p, a common situation
in modern data, we approximate the beta-Laguerre ensemble by a beta-Hermite
ensemble which is a generalization of the real, complex and quaternion Wigner
matrices. As corollaries, when n is much smaller than p, we show that the
largest and smallest eigenvalues of the complex Wishart matrix are
asymptotically independent; we obtain the limiting distribution of the
condition numbers as a sum of two i.i.d. random variables with a Tracy-Widom
distribution, which differs markedly from the exact square case n=p treated by
Edelman (1988); we propose a test procedure for a spherical hypothesis test. By
the same approximation tool, we obtain the asymptotic distribution of the
smallest eigenvalue of the beta-Laguerre ensemble. In the second part of the
paper, under the assumption that n is much smaller than p in a certain scale,
we prove the large deviation principles for three basic statistics: the largest
eigenvalue, the smallest eigenvalue and the empirical distribution of
eigenvalues, where the last large deviation is derived by using a non-standard
method.

### Random restricted partitions

We study two types of probability measures on the set of integer partitions
of $n$ with at most $m$ parts. The first one chooses the random partition with
a chance related to its largest part only. We then obtain the limiting
distributions of all of the parts together and that of the largest part as $n$
tends to infinity while $m$ is fixed or tends to infinity. In particular, if
$m$ tends to infinity sufficiently slowly, the largest part satisfies the central
limit theorem. The second measure is very general. It includes the Dirichlet
distribution and the uniform distribution as special cases. We derive the
asymptotic distributions of the parts jointly and that of the largest part by
taking the limit of $n$ and $m$ in the same manner as for the first probability
measure.

Comment: 32 pages
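
For small $n$ the objects in this abstract can be enumerated directly. The following sketch lists the partitions of $n$ with at most $m$ parts and tabulates the exact law of the largest part under the uniform measure, which the abstract names as a special case of the second family. The function names `partitions_at_most` and `largest_part_law` are hypothetical, and the brute-force enumeration is only practical for small $n$.

```python
from collections import Counter
from fractions import Fraction


def partitions_at_most(n, m, max_part=None):
    """Yield partitions of n into at most m parts as non-increasing tuples."""
    if max_part is None:
        max_part = n
    if n == 0:
        yield ()
        return
    if m == 0:
        return
    # Choose the largest remaining part, then partition what is left.
    for first in range(min(n, max_part), 0, -1):
        for rest in partitions_at_most(n - first, m - 1, first):
            yield (first,) + rest


def largest_part_law(n, m):
    """Exact law of the largest part under the uniform measure (n >= 1)."""
    parts = list(partitions_at_most(n, m))
    counts = Counter(p[0] for p in parts)
    total = len(parts)
    return {k: Fraction(c, total) for k, c in sorted(counts.items())}


if __name__ == "__main__":
    print(list(partitions_at_most(6, 3)))
    print(largest_part_law(6, 3))
```

Exact fractions are used instead of floats so that the probabilities sum to one without rounding error.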

### Moments of traces of circular beta-ensembles

Let $\theta_1,\ldots,\theta_n$ be random variables from Dyson's circular
$\beta$-ensemble with probability density function
$\operatorname{Const}\cdot\prod_{1\leq j<k\leq n}|e^{i\theta_j}-e^{i\theta_k}|^{\beta}$. For
each $n\geq2$ and $\beta>0$, we obtain some inequalities on
$\mathbb{E}[p_{\mu}(Z_n)\overline{p_{\nu}(Z_n)}]$, where
$Z_n=(e^{i\theta_1},\ldots,e^{i\theta_n})$ and $p_{\mu}$ is the power-sum
symmetric function for partition $\mu$. When $\beta=2$, our inequalities
recover an identity by Diaconis and Evans for Haar-invariant unitary matrices.
Further, we have the following: $\lim_{n\to\infty}\mathbb{E}[p_{\mu}(Z_n)\overline{p_{\nu}(Z_n)}]=
\delta_{\mu\nu}\left(\frac{2}{\beta}\right)^{l(\mu)}z_{\mu}$ for any $\beta>0$ and
partitions $\mu,\nu$; $\lim_{m\to\infty}\mathbb{E}[|p_m(Z_n)|^2]=n$ for any
$\beta>0$ and $n\geq2$, where $l(\mu)$ is the length of $\mu$ and $z_{\mu}$ is
explicit on $\mu$. These results apply to the three important ensembles: COE
($\beta=1$), CUE ($\beta=2$) and CSE ($\beta=4$). We further examine the
nonasymptotic behavior of $\mathbb{E}[|p_m(Z_n)|^2]$ for $\beta=1,4$. The
central limit theorems of $\sum_{j=1}^ng(e^{i\theta_j})$ are obtained when (i)
$g(z)$ is a polynomial and $\beta>0$ is arbitrary, or (ii) $g(z)$ has a Fourier
expansion and $\beta=1,4$. The main tool is the Jack function.

Comment: Published at http://dx.doi.org/10.1214/14-AOP960 in the Annals of
Probability (http://www.imstat.org/aop/) by the Institute of Mathematical
Statistics (http://www.imstat.org)
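
As a worked instance of the first limit, take the one-part partitions $\mu=\nu=(m)$. The abstract says only that $z_{\mu}$ is explicit, so the calculation below assumes the standard convention $z_{\mu}=\prod_{i\geq1} i^{a_i}\,a_i!$, where $a_i$ is the number of parts of $\mu$ equal to $i$ (the symbol $a_i$ is introduced here for illustration). Under that assumption $l(\mu)=1$ and $z_{(m)}=m$, so

$$
\lim_{n\to\infty}\mathbb{E}\left[|p_m(Z_n)|^2\right]
=\left(\frac{2}{\beta}\right)^{1} z_{(m)}
=\frac{2m}{\beta},
$$

which gives $2m$ for the COE ($\beta=1$), $m$ for the CUE ($\beta=2$), and $m/2$ for the CSE ($\beta=4$). The CUE value agrees with the classical unitary-group identity $\mathbb{E}[|p_m(Z_n)|^2]=\min(m,n)$, which equals $m$ once $n\geq m$.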

- …