Covariance Estimation in High Dimensions via Kronecker Product Expansions
This paper presents a new method for estimating high dimensional covariance
matrices. The method, permuted rank-penalized least-squares (PRLS), is based on
a Kronecker product series expansion of the true covariance matrix. Assuming an
i.i.d. Gaussian random sample, we establish high dimensional rates of
convergence to the true covariance as both the number of samples and the number
of variables go to infinity. For covariance matrices of low separation rank,
our results establish that PRLS has significantly faster convergence than the
standard sample covariance matrix (SCM) estimator. The convergence rate
captures a fundamental tradeoff between estimation error and approximation
error, thus providing a scalable covariance estimation framework in terms of
separation rank, similar to low rank approximation of covariance matrices. The
MSE convergence rates generalize the high dimensional rates recently obtained
for the ML Flip-flop algorithm for Kronecker product covariance estimation. We
show that a class of block Toeplitz covariance matrices can be approximated by
matrices of low separation rank, and give bounds on the minimal separation rank that
ensures a given level of bias. Simulations are presented to validate the
theoretical bounds. As a real-world application, we illustrate the utility of
the proposed Kronecker covariance estimator for spatio-temporal linear least
squares prediction of multivariate wind speed measurements.
Comment: 47 pages, accepted to IEEE Transactions on Signal Processing
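The notion of separation rank used in this abstract can be illustrated with a small sketch. The abstract does not spell out the permutation behind PRLS; the code below assumes the standard block rearrangement, in which each q x q block of a (pq) x (pq) matrix is flattened into one row. Under that assumption a single Kronecker product A ⊗ B maps to a rank-1 matrix, and a sum of r Kronecker products maps to a matrix of rank at most r. The helper names (`kron`, `rearrange`, `matrix_rank`) and the example matrices are hypothetical, and the sketch covers only the rearrangement, not the penalized estimator itself.

```python
from fractions import Fraction


def kron(a, b):
    """Kronecker product of two matrices given as lists of lists."""
    r, s = len(b), len(b[0])
    return [[a[i // r][j // s] * b[i % r][j % s]
             for j in range(len(a[0]) * s)]
            for i in range(len(a) * r)]


def rearrange(sigma, p, q):
    """Flatten each q x q block of a (p*q) x (p*q) matrix into one row.

    The result has p*p rows (one per block) and q*q columns.
    This rearrangement is an assumption, not taken from the abstract.
    """
    return [[sigma[i * q + k][j * q + l] for k in range(q) for l in range(q)]
            for i in range(p) for j in range(p)]


def matrix_rank(m):
    """Exact rank by Gaussian elimination over the rationals."""
    rows = [[Fraction(x) for x in row] for row in m]
    rank = 0
    for col in range(len(rows[0])):
        pivot = next((r for r in range(rank, len(rows)) if rows[r][col] != 0), None)
        if pivot is None:
            continue
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        for r in range(rank + 1, len(rows)):
            factor = rows[r][col] / rows[rank][col]
            rows[r] = [x - factor * y for x, y in zip(rows[r], rows[rank])]
        rank += 1
    return rank


# Illustrative factors: p = 2 (first factor), q = 3 (second factor).
A1 = [[2, 1], [1, 2]]
B1 = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
A2 = [[1, 0], [0, 3]]
B2 = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]

sigma1 = kron(A1, B1)                      # one Kronecker term
k2 = kron(A2, B2)
sigma2 = [[x + y for x, y in zip(r1, r2)]  # sum of two Kronecker terms
          for r1, r2 in zip(sigma1, k2)]

rank_one_term = matrix_rank(rearrange(sigma1, 2, 3))
rank_two_terms = matrix_rank(rearrange(sigma2, 2, 3))
```

Here `rank_one_term` is the rank of the rearranged single-term matrix and `rank_two_terms` that of the two-term sum. A low-rank approximation of the rearranged sample covariance then corresponds to a truncated Kronecker product series, which is the sense in which separation rank plays a role similar to ordinary rank.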
Validation of nonlinear PCA
Linear principal component analysis (PCA) can be extended to a nonlinear PCA
by using artificial neural networks. However, the benefit of curved components
requires careful control of the model complexity. Moreover, standard
techniques for model selection, including cross-validation and more generally
the use of an independent test set, fail when applied to nonlinear PCA because
of its inherent unsupervised characteristics. This paper presents a new
approach for validating the complexity of nonlinear PCA models by using the
error in missing data estimation as a criterion for model selection. It is
motivated by the idea that only the model of optimal complexity is able to
predict missing values with the highest accuracy. While standard test set
validation usually favours over-fitted nonlinear PCA models, the proposed model
validation approach correctly selects the optimal model complexity.
Comment: 12 pages, 5 figures