PCA consistency in high dimension, low sample size context
Principal Component Analysis (PCA) is an important tool for dimension
reduction, especially when the dimension (or the number of variables) is very
high. Asymptotic studies where the sample size is fixed, and the dimension
grows [i.e., High Dimension, Low Sample Size (HDLSS)] are becoming increasingly
relevant. We investigate the asymptotic behavior of the Principal Component
(PC) directions. HDLSS asymptotics are used to study consistency, strong
inconsistency and subspace consistency. We show that if the first few
eigenvalues of a population covariance matrix are large enough compared to the
others, then the corresponding estimated PC directions are consistent or
converge to the appropriate subspace (subspace consistency) and most other PC
directions are strongly inconsistent. Broad sets of sufficient conditions for
each of these cases are specified and the main theorem gives a catalogue of
possible combinations. In preparation for these results, we show that the
geometric representation of HDLSS data holds under general conditions, which
include a ρ-mixing condition and a broad range of sphericity measures of
the covariance matrix.
Comment: Published at http://dx.doi.org/10.1214/09-AOS709 in the Annals of
Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical
Statistics (http://www.imstat.org
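
The HDLSS setting described in the abstract (fixed sample size, growing dimension) can be explored numerically. The following is a minimal sketch, not the paper's method: it assumes Gaussian data with mean zero and a single-spike covariance diag(d^alpha, 1, ..., 1), and it measures the angle between the first sample PC direction and the true one. The spike model, the exponent `alpha`, and the function names `first_pc_angle` and `mean_angle` are assumptions introduced here for illustration only.

```python
import numpy as np


def first_pc_angle(d, n, alpha, rng):
    """Angle in degrees between the first sample PC direction and e_1."""
    # Assumed spike model: population covariance diag(d**alpha, 1, ..., 1),
    # so the true first PC direction is the first coordinate axis e_1.
    scales = np.ones(d)
    scales[0] = d ** (alpha / 2.0)
    X = rng.standard_normal((n, d)) * scales  # n samples, d variables

    # The population mean is zero by construction, so no centering is done.
    # The rows of vt are eigenvectors of the sample covariance X.T @ X / n.
    _, _, vt = np.linalg.svd(X, full_matrices=False)
    cos = min(abs(vt[0, 0]), 1.0)  # |<v_1, e_1>|, clipped for arccos
    return float(np.degrees(np.arccos(cos)))


def mean_angle(d, n, alpha, reps=20, seed=0):
    """Average the angle over several independent replications."""
    rng = np.random.default_rng(seed)
    return float(np.mean([first_pc_angle(d, n, alpha, rng) for _ in range(reps)]))


if __name__ == "__main__":
    n = 10  # sample size stays fixed while the dimension d grows
    for alpha in (0.5, 1.5):
        for d in (100, 1000, 5000):
            print(f"alpha={alpha} d={d} mean angle={mean_angle(d, n, alpha):.1f}")
```

Under these assumptions, one would expect the angle to shrink as d grows when the leading eigenvalue is large relative to the others (here alpha = 1.5), and to drift toward 90 degrees when it is not (here alpha = 0.5), which mirrors the consistency and strong inconsistency cases named in the abstract.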