
    Tensor Graphical Lasso (TeraLasso)

    This paper introduces a multi-way tensor generalization of the Bigraphical Lasso (BiGLasso), which uses a two-way sparse Kronecker-sum multivariate-normal model for the precision matrix to parsimoniously model conditional dependence relationships of matrix-variate data based on the Cartesian product of graphs. We call this generalization the Tensor graphical Lasso (TeraLasso). We demonstrate through theory and examples that the TeraLasso model can be accurately and scalably estimated from very limited data samples of high dimensional variables with multiway coordinates such as space, time, and replicates. Statistical consistency and rates of convergence are established for the BiGLasso and TeraLasso estimators of the precision matrix, and for estimators of its support (non-sparsity) set. We propose a scalable composite gradient descent algorithm and analyze its computational convergence rate, showing that it is guaranteed to converge at a geometric rate to the global minimizer of the TeraLasso objective function. Finally, we illustrate TeraLasso on both simulated data and experimental data from a meteorological dataset, showing that we can accurately estimate precision matrices and recover meaningful conditional dependency graphs from high dimensional complex datasets.
    Comment: accepted to JRSS-B
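
    The Kronecker-sum structure at the core of the model can be written down directly. Below is a minimal sketch in Python/NumPy (illustrative only, not the authors' code; the function name kron_sum and the factor sizes are assumptions) showing how small per-axis precision factors combine into a single precision matrix over the whole tensor.

```python
import numpy as np

def kron_sum(factors):
    """Kronecker sum Psi_1 ⊕ ... ⊕ Psi_K of square factor matrices.

    For factors of sizes d_1, ..., d_K the result is the
    (d_1*...*d_K)-dimensional matrix sum_k I ⊗ ... ⊗ Psi_k ⊗ ... ⊗ I,
    the precision structure induced by a Cartesian product of graphs.
    """
    dims = [f.shape[0] for f in factors]
    p = int(np.prod(dims))
    omega = np.zeros((p, p))
    for k, fk in enumerate(factors):
        left = np.eye(int(np.prod(dims[:k])))
        right = np.eye(int(np.prod(dims[k + 1:])))
        omega += np.kron(np.kron(left, fk), right)
    return omega

# Example: three-way data (e.g., space x time x replicates)
rng = np.random.default_rng(0)
factors = []
for d in (4, 3, 5):
    a = rng.standard_normal((d, d))
    factors.append(a @ a.T + d * np.eye(d))  # positive definite factor
omega = kron_sum(factors)  # 60 x 60 precision matrix built from small factors
```

    Note how the full dimension (60) is the product of the factor dimensions while the number of free parameters grows only with their sum, which is what makes estimation from very few samples feasible.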

    Foundational principles for large scale inference: Illustrations through correlation mining

    When can reliable inference be drawn in the "Big Data" context? This paper presents a framework for answering this fundamental question in the context of correlation mining, with implications for general large scale inference. In large scale data applications like genomics, connectomics, and eco-informatics the dataset is often variable-rich but sample-starved: a regime where the number n of acquired samples (statistical replicates) is far smaller than the number p of observed variables (genes, neurons, voxels, or chemical constituents). Much recent work has focused on understanding the computational complexity of proposed methods for "Big Data." Sample complexity, however, has received relatively less attention, especially in the setting where the sample size n is fixed and the dimension p grows without bound. To address this gap, we develop a unified statistical framework that explicitly quantifies the sample complexity of various inferential tasks. Sampling regimes can be divided into several categories: 1) the classical asymptotic regime, where the variable dimension is fixed and the sample size goes to infinity; 2) the mixed asymptotic regime, where both variable dimension and sample size go to infinity at comparable rates; 3) the purely high dimensional asymptotic regime, where the variable dimension goes to infinity and the sample size is fixed. Each regime has its niche, but only the last applies to exa-scale data dimensions. We illustrate this high dimensional framework for the problem of correlation mining, where it is the matrix of pairwise and partial correlations among the variables that is of interest. We demonstrate various regimes of correlation mining based on the unifying perspective of high dimensional learning rates and sample complexity for different structured covariance models and different inference tasks.
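
    As a concrete illustration of the sample-starved regime, the sketch below (hypothetical code, not from the paper) screens the sample correlation matrix at a fixed threshold. With n far smaller than p, large spurious correlations arise even when all variables are independent, which is precisely the phenomenon a sample-complexity analysis must quantify.

```python
import numpy as np

def correlation_screen(X, rho):
    """Return variable pairs whose sample correlation exceeds rho in magnitude.

    X is an n x p data matrix with n samples of p variables
    (correlation mining typically operates with n << p).
    """
    R = np.corrcoef(X, rowvar=False)      # p x p sample correlation matrix
    iu = np.triu_indices_from(R, k=1)     # upper triangle, i < j
    mask = np.abs(R[iu]) >= rho
    return list(zip(iu[0][mask], iu[1][mask]))

# Sample-starved regime: n = 20 samples of p = 1000 independent variables
rng = np.random.default_rng(1)
X = rng.standard_normal((20, 1000))
edges = correlation_screen(X, rho=0.7)
print(len(edges))  # typically nonzero: spurious discoveries under pure noise
```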

    Probabilistic Interpretation of Linear Solvers

    This manuscript proposes a probabilistic framework for algorithms that iteratively solve unconstrained linear problems Bx = b with positive definite B for x. The goal is to replace the point estimates returned by existing methods with a Gaussian posterior belief over the elements of the inverse of B, which can be used to estimate errors. Recent probabilistic interpretations of the secant family of quasi-Newton optimization algorithms are extended. Combined with properties of the conjugate gradient algorithm, this leads to uncertainty-calibrated methods with very limited cost overhead over conjugate gradients, a self-contained novel interpretation of the quasi-Newton and conjugate gradient algorithms, and a foundation for new nonlinear optimization methods.
    Comment: final version, in press at SIAM J Optimization
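
    The conjugate gradient recursion underlying the construction is short enough to state. The sketch below is a standard textbook CG solver, not the paper's calibrated method; the probabilistic framework reinterprets these point-estimate iterates as posterior means of a Gaussian belief over the solution (or over the inverse of B).

```python
import numpy as np

def conjugate_gradient(B, b, tol=1e-8, max_iter=None):
    """Solve Bx = b for symmetric positive definite B by conjugate gradients."""
    n = b.shape[0]
    max_iter = n if max_iter is None else max_iter
    x = np.zeros(n)
    r = b - B @ x                  # residual
    d = r.copy()                   # search direction
    rs = r @ r
    for _ in range(max_iter):
        Bd = B @ d
        alpha = rs / (d @ Bd)      # exact line search along d
        x += alpha * d
        r -= alpha * Bd
        rs_new = r @ r
        if np.sqrt(rs_new) < tol:
            break
        d = r + (rs_new / rs) * d  # B-conjugate update of the direction
        rs = rs_new
    return x

rng = np.random.default_rng(2)
A = rng.standard_normal((50, 50))
B = A @ A.T + 50 * np.eye(50)      # positive definite test matrix
b = rng.standard_normal(50)
x = conjugate_gradient(B, b)
assert np.allclose(B @ x, b, atol=1e-6)
```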

    Covariance Estimation in High Dimensions via Kronecker Product Expansions

    This paper presents a new method for estimating high dimensional covariance matrices. The method, permuted rank-penalized least-squares (PRLS), is based on a Kronecker product series expansion of the true covariance matrix. Assuming an i.i.d. Gaussian random sample, we establish high dimensional rates of convergence to the true covariance as both the number of samples and the number of variables go to infinity. For covariance matrices of low separation rank, our results establish that PRLS converges significantly faster than the standard sample covariance matrix (SCM) estimator. The convergence rate captures a fundamental tradeoff between estimation error and approximation error, thus providing a scalable covariance estimation framework in terms of separation rank, analogous to low rank approximation of covariance matrices. The MSE convergence rates generalize the high dimensional rates recently obtained for the ML flip-flop algorithm for Kronecker product covariance estimation. We show that a class of block Toeplitz covariance matrices is well approximated by matrices of low separation rank, and give bounds on the minimal separation rank r that ensures a given level of bias. Simulations are presented to validate the theoretical bounds. As a real world application, we illustrate the utility of the proposed Kronecker covariance estimator for spatio-temporal linear least squares prediction of multivariate wind speed measurements.
    Comment: 47 pages, accepted to IEEE Transactions on Signal Processing
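
    The rearrangement idea behind the Kronecker series expansion can be sketched compactly: a fixed permutation of the covariance entries maps every Kronecker product to a rank-one matrix, so truncating the SVD of the rearranged matrix gives a low separation-rank approximation. The sketch below implements this classical Van Loan-Pitsianis construction (PRLS itself adds the rank penalty and the high dimensional analysis; the helper names are assumptions).

```python
import numpy as np

def rearrange(S, p, q):
    """Rearrangement R(S) of a (p*q) x (p*q) matrix S.

    Maps A ⊗ B (A: p x p, B: q x q) to the rank-one matrix
    vec(A) vec(B)^T (row-major vec), so the separation rank of S
    equals the rank of R(S).
    """
    blocks = S.reshape(p, q, p, q)   # blocks[i, k, j, l] = S[i*q + k, j*q + l]
    return blocks.transpose(0, 2, 1, 3).reshape(p * p, q * q)

def kron_approx(S, p, q, r):
    """Separation-rank-r approximation sum_m A_m ⊗ B_m of S."""
    U, s, Vt = np.linalg.svd(rearrange(S, p, q), full_matrices=False)
    terms = [
        np.kron((s[m] * U[:, m]).reshape(p, p), Vt[m].reshape(q, q))
        for m in range(r)
    ]
    return sum(terms)

# A separation-rank-1 covariance is recovered exactly with r = 1
rng = np.random.default_rng(3)
A = rng.standard_normal((3, 3)); A = A @ A.T
B = rng.standard_normal((4, 4)); B = B @ B.T
S = np.kron(A, B)
assert np.allclose(kron_approx(S, 3, 4, r=1), S)
```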

    Network inference in matrix-variate Gaussian models with non-independent noise

    Inferring a graphical model or network from observations of a large number of variables is a well studied problem in machine learning and computational statistics. In this paper we consider a version of this problem that is relevant to the analysis of multiple phenotypes collected in genetic studies. In such datasets we expect correlations both between phenotypes and between individuals. We model observations as a sum of two matrix normal variates, so that the joint covariance function is a sum of Kronecker products. This model, which generalizes the Graphical Lasso, assumes observations are correlated due to known genetic relationships and corrupted by non-independent noise. We have developed a computationally efficient EM algorithm to fit this model. On simulated datasets we demonstrate substantially improved performance in network reconstruction from allowing for a general noise distribution.
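
    For intuition about the covariance model, the sketch below (hypothetical helper names, not the authors' EM code) draws an observation as the sum of two matrix normal variates; the joint covariance of the vectorized observation is then a sum of two Kronecker products, e.g. a signal term whose row covariance encodes known genetic relatedness plus a structured noise term.

```python
import numpy as np

def matrix_normal(C, R, rng):
    """Draw X with vec(X) ~ N(0, C ⊗ R) (column-major vec).

    R is the n x n row (individual) covariance and C the p x p
    column (phenotype) covariance; X = L_R Z L_C^T for i.i.d. normal Z.
    """
    Lr = np.linalg.cholesky(R)
    Lc = np.linalg.cholesky(C)
    Z = rng.standard_normal((R.shape[0], C.shape[0]))
    return Lr @ Z @ Lc.T

def sample_two_kron(C1, R1, C2, R2, rng):
    """Y = X1 + X2, so cov(vec(Y)) = C1 ⊗ R1 + C2 ⊗ R2."""
    return matrix_normal(C1, R1, rng) + matrix_normal(C2, R2, rng)

rng = np.random.default_rng(4)
n, p = 6, 3  # individuals x phenotypes

def spd(d):
    M = rng.standard_normal((d, d))
    return M @ M.T + d * np.eye(d)  # random positive definite matrix

Y = sample_two_kron(spd(p), spd(n), spd(p), spd(n), rng)  # n x p observation
```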