
    Minimax Rate-Optimal Estimation of High-Dimensional Covariance Matrices with Incomplete Data

    Missing data occur frequently in a wide range of applications. In this paper, we consider estimation of high-dimensional covariance matrices in the presence of missing observations under a general missing completely at random (MCAR) model, in which the missingness does not depend on the values of the data. Based on the incomplete data, estimators for bandable and sparse covariance matrices are proposed and their theoretical and numerical properties are investigated. Minimax rates of convergence are established under the spectral norm loss, and the proposed estimators are shown to be rate-optimal under mild regularity conditions. Simulation studies demonstrate that the estimators perform well numerically. The methods are also illustrated through an application to data from four ovarian cancer studies. The key technical tools developed in this paper are of independent interest and potentially useful for a range of related problems in high-dimensional statistical inference with missing data.
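The paper's own estimators are not reproduced here; the following is a minimal sketch of the general idea under MCAR: form a "generalized" sample covariance from pairwise-complete observations, then apply entrywise thresholding for the sparse case. The function names and the hard-thresholding rule are illustrative choices, not the paper's exact procedure.

```python
import numpy as np

def pairwise_complete_cov(X):
    """Sample covariance where entry (j, k) uses only the rows in which
    both variables j and k are observed (NaN marks missing). Valid as a
    consistent estimate under MCAR."""
    mask = ~np.isnan(X)
    Xf = np.where(mask, X, 0.0)
    M = mask.astype(float)
    counts = M.T @ M                 # jointly observed counts per (j, k)
    cross = Xf.T @ Xf                # sums of products over joint rows
    mean_j = (Xf.T @ M) / counts     # mean of X_j over rows where both seen
    mean_k = (M.T @ Xf) / counts     # mean of X_k over rows where both seen
    return cross / counts - mean_j * mean_k

def threshold(S, lam):
    """Entrywise hard thresholding for sparse covariance estimation,
    keeping the diagonal intact."""
    T = np.where(np.abs(S) >= lam, S, 0.0)
    np.fill_diagonal(T, np.diag(S))
    return T
```

With no missing values, `pairwise_complete_cov` reduces to the ordinary (biased) sample covariance; with NaNs present it stays symmetric and well defined as long as every pair of columns is jointly observed at least twice.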

    Posterior contraction in sparse Bayesian factor models for massive covariance matrices

    Sparse Bayesian factor models are routinely implemented for parsimonious dependence modeling and dimensionality reduction in high-dimensional applications. We provide theoretical understanding of such Bayesian procedures in terms of posterior convergence rates for inferring high-dimensional covariance matrices, where the dimension can be larger than the sample size. Under relevant sparsity assumptions on the true covariance matrix, we show that commonly used point mass mixture priors on the factor loadings lead to consistent estimation in the operator norm even when $p \gg n$. One of our major contributions is to develop a new class of continuous shrinkage priors and provide insights into their concentration around sparse vectors. Using such priors for the factor loadings, we obtain rates of convergence similar to those obtained with point mass mixture priors. To obtain the convergence rates, we construct test functions to separate points in the space of high-dimensional covariance matrices using insights from random matrix theory; the tools developed may be of independent interest. We also derive minimax rates and show that the Bayesian posterior rates of convergence coincide with the minimax rates up to a $\sqrt{\log n}$ term. Comment: Published in the Annals of Statistics (http://www.imstat.org/aos/) at http://dx.doi.org/10.1214/14-AOS1215 by the Institute of Mathematical Statistics (http://www.imstat.org).
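The sparse factor-model structure the abstract refers to, $\Sigma = \Lambda\Lambda^{\top} + \sigma^2 I$ with mostly-zero loadings, can be made concrete with a small simulation. The sketch below (dimensions, sparsity level, and noise variance are illustrative assumptions, and the spike-and-slab loadings are simulated, not drawn from a posterior) shows the $p \gg n$ regime in which the raw sample covariance is a poor estimator in operator norm.

```python
import numpy as np

rng = np.random.default_rng(0)
p, k, n = 200, 5, 100    # dimension p exceeds sample size n

# Sparse loading matrix: most entries exactly zero, mimicking the
# point-mass-mixture (spike-and-slab) sparsity the abstract assumes.
Lambda = rng.normal(size=(p, k)) * (rng.random((p, k)) < 0.1)
Sigma = Lambda @ Lambda.T + np.eye(p)    # factor-model covariance

# Draw n << p observations and form the raw sample covariance.
X = rng.multivariate_normal(np.zeros(p), Sigma, size=n)
S = X.T @ X / n

# Operator-norm error of the sample covariance; a sparsity-exploiting
# Bayesian estimator targets a much better rate in this regime.
err = np.linalg.norm(S - Sigma, 2)
```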

    Exponentially convergent data assimilation algorithm for Navier-Stokes equations

    The paper presents a new state estimation algorithm for a bilinear equation representing the Fourier-Galerkin (FG) approximation of the Navier-Stokes (NS) equations on a torus in R^2. This state equation is subject to uncertain but bounded noise in the input (Kolmogorov forcing) and initial conditions, and its output is incomplete and contains bounded noise. The algorithm designs a time-dependent gain such that the estimation error converges to zero exponentially. The sufficient conditions for the existence of the gain are formulated in the form of algebraic Riccati equations. To demonstrate the results, we apply the proposed algorithm to the reconstruction of a chaotic fluid flow from incomplete and noisy data.
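The Riccati-gain mechanism behind such observers can be illustrated on a toy linear system (the paper itself treats a bilinear FG truncation with a time-dependent gain; the constant-gain linear sketch below, with assumed matrices A, C and design weights Q, R, only shows why a Riccati solution yields exponentially decaying estimation error).

```python
import numpy as np
from scipy.linalg import solve_continuous_are

# Toy stand-in for the truncated dynamics: x' = A x, partial output y = C x.
A = np.array([[0.0, 1.0], [-2.0, 0.1]])
C = np.array([[1.0, 0.0]])           # only the first state is observed

Q = np.eye(2)                        # design weights (assumed here)
R = np.eye(1)
# Observer Riccati equation (dual of the control ARE)
P = solve_continuous_are(A.T, C.T, Q, R)
L = P @ C.T @ np.linalg.inv(R)       # observer gain

# Estimation-error dynamics e' = (A - L C) e: exponential convergence
# holds because A - L C is Hurwitz (all eigenvalues in the open left
# half-plane) whenever the Riccati equation is solvable.
eigs = np.linalg.eigvals(A - L @ C)
assert np.all(eigs.real < 0)
```

The observer itself then runs x̂' = A x̂ + L(y − C x̂); subtracting the true dynamics gives the error equation in the comment above.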