13 research outputs found

    Joint Covariance Estimation with Mutual Linear Structure

    We consider the problem of joint estimation of structured covariance matrices. Assuming the structure is unknown, estimation is achieved using heterogeneous training sets: given groups of measurements coming from centered populations with different covariances, our aim is to determine the mutual structure of these covariance matrices and estimate them. Supposing that the covariances span a low-dimensional affine subspace in the space of symmetric matrices, we develop a new efficient algorithm that discovers this structure and uses it to improve the estimation. Our technique is based on the application of principal component analysis in the matrix space. We also derive an upper performance bound for the proposed algorithm in the Gaussian scenario and compare it with the Cramér-Rao lower bound. Numerical simulations are presented to illustrate the performance benefits of the proposed method.
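
    As a rough illustration of the general idea (not the paper's exact algorithm), the sketch below vectorizes per-group sample covariances, runs PCA in the space of symmetric matrices to fit a low-dimensional affine subspace, and projects each covariance onto it; the function and argument names are hypothetical.

```python
import numpy as np

def joint_structured_covariances(samples, r):
    """Illustrative sketch: joint covariance estimation via PCA in matrix space.

    samples : list of (n_k, p) arrays drawn from centered populations (hypothetical API)
    r       : assumed dimension of the affine subspace spanned by the covariances
    """
    covs = [x.T @ x / len(x) for x in samples]          # per-group sample covariances
    vecs = np.stack([c.ravel() for c in covs])          # points in R^(p*p)

    # PCA in the matrix space: center and keep the top-r principal directions.
    mean = vecs.mean(axis=0)
    _, _, vt = np.linalg.svd(vecs - mean, full_matrices=False)
    basis = vt[:r]

    # Project each sample covariance onto the fitted affine subspace; symmetrize.
    proj = mean + (vecs - mean) @ basis.T @ basis
    p = covs[0].shape[0]
    return [0.5 * (m + m.T) for m in proj.reshape(-1, p, p)]
```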

    Tyler's Covariance Matrix Estimator in Elliptical Models with Convex Structure

    We address structured covariance estimation in elliptical distributions by assuming that the covariance is a priori known to belong to a given convex set, e.g., the set of Toeplitz or banded matrices. We consider Generalized Method of Moments (GMM) optimization applied to Tyler's robust scatter M-estimator subject to these convex constraints. Unfortunately, GMM turns out to be non-convex due to its objective. Instead, we propose a new COCA estimator - a convex relaxation which can be efficiently solved. We prove that the relaxation is tight in the unconstrained case for a finite number of samples, and in the constrained case asymptotically. We then illustrate the advantages of COCA in synthetic simulations with structured compound Gaussian distributions. In these examples, COCA outperforms competing methods such as Tyler's estimator and its projection onto the structure set.
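
    For context, Tyler's scatter M-estimator referenced above is computed by a standard fixed-point iteration; a minimal NumPy sketch is given below (the COCA relaxation itself is not reproduced, and the function name is hypothetical).

```python
import numpy as np

def tyler_scatter(X, n_iter=100, tol=1e-8):
    """Fixed-point iteration for Tyler's scatter M-estimator on centered samples X of shape (n, p)."""
    n, p = X.shape
    sigma = np.eye(p)
    for _ in range(n_iter):
        inv = np.linalg.inv(sigma)
        # Weights 1 / (x_i^T Sigma^{-1} x_i) downweight high-leverage samples.
        w = 1.0 / np.einsum('ij,jk,ik->i', X, inv, X)
        new = (p / n) * (X * w[:, None]).T @ X
        new *= p / np.trace(new)              # fix the arbitrary scale
        if np.linalg.norm(new - sigma, 'fro') < tol:
            return new
        sigma = new
    return sigma
```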

    Diagonal and Low-Rank Matrix Decompositions, Correlation Matrices, and Ellipsoid Fitting

    In this paper we establish links between, and new results for, three problems that are not usually considered together. The first is a matrix decomposition problem that arises in areas such as statistical modeling and signal processing: given a matrix $X$ formed as the sum of an unknown diagonal matrix and an unknown low-rank positive semidefinite matrix, decompose $X$ into these constituents. The second problem we consider is to determine the facial structure of the set of correlation matrices, a convex set also known as the elliptope. This convex body, and particularly its facial structure, plays a role in applications from combinatorial optimization to mathematical finance. The third problem is a basic geometric question: given points $v_1, v_2, \ldots, v_n \in \mathbb{R}^k$ (where $n > k$), determine whether there is a centered ellipsoid passing exactly through all of the points. We show that in a precise sense these three problems are equivalent. Furthermore, we establish a simple sufficient condition on a subspace $U$ that ensures any positive semidefinite matrix $L$ with column space $U$ can be recovered from $D + L$ for any diagonal matrix $D$ using a convex optimization-based heuristic known as minimum trace factor analysis. This result leads to a new understanding of the structure of rank-deficient correlation matrices and a simple condition on a set of points that ensures there is a centered ellipsoid passing through them.
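
    The minimum trace factor analysis heuristic mentioned above is a small semidefinite program; a hedged sketch using cvxpy (an assumed dependency, with a hypothetical function name) is shown below.

```python
import numpy as np
import cvxpy as cp

def minimum_trace_factor_analysis(X):
    """Split X = D + L, with D nonnegative diagonal and L PSD, by minimizing trace(L)."""
    n = X.shape[0]
    L = cp.Variable((n, n), PSD=True)
    offdiag = 1.0 - np.eye(n)
    constraints = [
        cp.multiply(offdiag, X - L) == 0,   # X - L must be diagonal ...
        cp.diag(X - L) >= 0,                # ... with nonnegative entries
    ]
    cp.Problem(cp.Minimize(cp.trace(L)), constraints).solve()
    D = np.diag(np.diag(X - L.value))
    return D, L.value
```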

    Learning permutation symmetries with gips in R

    The study of hidden structures in data presents challenges in modern statistics and machine learning. We introduce the gips package in R, which identifies permutation subgroup symmetries in Gaussian vectors. gips serves two main purposes: exploratory analysis to discover hidden permutation symmetries, and estimation of the covariance matrix under permutation symmetry. It is competitive with canonical dimensionality-reduction methods while providing a new interpretation of the results. gips implements a novel Bayesian model selection procedure within the class of Gaussian vectors invariant under a permutation subgroup, introduced in Graczyk, Ishi, Kołodziejek, Massam, Annals of Statistics, 50 (3) (2022).
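
    The package itself is in R; purely as a language-agnostic illustration of estimating a covariance under a known permutation symmetry (not gips's Bayesian subgroup selection), one can average the sample covariance over the subgroup, as in the hypothetical Python sketch below.

```python
import numpy as np

def project_onto_symmetry(S, group):
    """Project a sample covariance S onto the matrices invariant under a permutation subgroup.

    group : list of index arrays, assumed to enumerate the whole subgroup.
    """
    S = np.asarray(S, dtype=float)
    avg = np.zeros_like(S)
    for g in group:
        avg += S[np.ix_(g, g)]          # P_g S P_g^T via fancy indexing
    return avg / len(group)

# Example: impose invariance under the cyclic shift of 3 coordinates.
# S_sym = project_onto_symmetry(S, [np.arange(3), np.array([1, 2, 0]), np.array([2, 0, 1])])
```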

    Matrix compression along isogenic blocks

    A matrix-compression algorithm is derived from a novel isogenic block decomposition for square matrices. The resulting compression and inflation operations possess strong functorial and spectral-permanence properties. The basic observation that Hadamard entrywise functional calculus preserves isogenic blocks has already proved to be of paramount importance for thresholding large correlation matrices. The proposed isogenic stratification of the set of complex matrices bears similarities to the Schubert cell stratification of a homogeneous algebraic manifold. An array of potential applications to current investigations in computational matrix analysis is briefly mentioned, touching on concepts such as symmetric statistical models, hierarchical matrices, and coherent matrix organization induced by partition trees.

    Robustness and invariance in the generalization error of deep neural networks

    In recent years Deep Neural Networks (DNNs) have achieved state-of-the-art results in many fields, such as speech recognition and computer vision. Despite their success in practice, many theoretical fundamentals of DNNs are still not well understood. One of them is the generalization error of DNNs, which is the topic of this thesis. The thesis first reviews the theory and practice of DNNs, focusing specifically on theoretical results that provide generalization error bounds. We argue that the current state-of-the-art theoretical results, which rely on the width and depth of deep neural networks, do not apply in many practical scenarios where the networks are very wide or very deep. A novel approach to the theoretical analysis of the generalization error of DNNs is proposed next. The proposed approach relies on the classification margin of the DNN and on the complexity of the data. As this result does not rely on the width or the depth of the network, it provides a rationale for the practical success of learning with very wide and deep neural networks. These results are then extended to learning problems where symmetries are present in the data. The analysis shows that if a DNN is invariant to such symmetries, its generalization error may be much smaller than that of a non-invariant DNN. Finally, two novel regularization methods for DNNs, motivated by the theoretical analysis, are presented and their performance is evaluated on various datasets such as MNIST, CIFAR-10, ImageNet and LaRED. The thesis is concluded by a summary of contributions and a discussion of possible extensions of the current work.