Joint Covariance Estimation with Mutual Linear Structure
We consider the problem of joint estimation of structured covariance
matrices. Assuming the structure is unknown, estimation is achieved using
heterogeneous training sets. Namely, given groups of measurements coming from
centered populations with different covariances, our aim is to determine the
mutual structure of these covariance matrices and estimate them. Supposing that
the covariances span a low dimensional affine subspace in the space of
symmetric matrices, we develop a new efficient algorithm that discovers the
structure and uses it to improve the estimation. Our technique is based on the
application of principal component analysis in the matrix space. We also derive
an upper performance bound of the proposed algorithm in the Gaussian scenario
and compare it with the Cramér-Rao lower bound. Numerical simulations are
presented to illustrate the performance benefits of the proposed method.
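To make the matrix-space PCA step concrete, here is a minimal sketch (our illustration under assumed dimensions, not the authors' code): compute a sample covariance per group, vectorize the covariances, fit a low-dimensional affine subspace by PCA, and project each estimate onto it.

```python
# Minimal sketch of PCA in the space of symmetric matrices; the subspace
# dimension r and all problem sizes are illustrative assumptions.
import numpy as np

def joint_structured_covariances(samples, r):
    """samples: list of (n_k, p) arrays of centered measurements.
    r: assumed dimension of the mutual affine subspace."""
    covs = [X.T @ X / len(X) for X in samples]        # per-group sample covariances
    V = np.stack([C.ravel() for C in covs])           # vectorize: K x p^2
    mean = V.mean(axis=0)                             # affine offset of the subspace
    _, _, Vt = np.linalg.svd(V - mean, full_matrices=False)
    B = Vt[:r]                                        # top-r principal directions
    proj = mean + (V - mean) @ B.T @ B                # project each covariance
    p = covs[0].shape[0]
    return [row.reshape(p, p) for row in proj]

# usage: three groups whose true covariances share a mutual structure
rng = np.random.default_rng(0)
p = 5
truths = [np.eye(p) + t * np.diag(np.arange(p) / p) for t in (0.5, 1.0, 1.5)]
data = [rng.multivariate_normal(np.zeros(p), S, size=200) for S in truths]
estimates = joint_structured_covariances(data, r=2)
```

The projection preserves symmetry but not necessarily positive semidefiniteness; a final projection onto the PSD cone (eigenvalue clipping) could be appended.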
Tyler's Covariance Matrix Estimator in Elliptical Models with Convex Structure
We address structured covariance estimation in elliptical distributions by
assuming that the covariance is a priori known to belong to a given convex set,
e.g., the set of Toeplitz or banded matrices. We consider the General Method of
Moments (GMM) optimization applied to Tyler's robust scatter M-estimator
subject to these convex constraints. Unfortunately, the GMM program turns out
to be non-convex due to its objective. Instead, we propose a new estimator, COCA: a
convex relaxation which can be efficiently solved. We prove that the relaxation
is tight in the unconstrained case for a finite number of samples, and in the
constrained case asymptotically. We then illustrate the advantages of COCA in
synthetic simulations with structured compound Gaussian distributions. In these
examples, COCA outperforms competing methods such as Tyler's estimator and its
projection onto the structure set.
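For orientation, here is a minimal sketch of the baseline named at the end of the abstract (Tyler's estimator followed by projection onto the structure set), taking the Toeplitz set as the example structure. This is not the COCA relaxation itself, and the iteration count and problem sizes are assumptions.

```python
# Tyler's fixed-point iteration plus Euclidean projection onto the
# Toeplitz set; a sketch of the competing method, not of COCA.
import numpy as np

def tyler(X, n_iter=50):
    """X: (n, p) centered samples; returns Tyler's scatter M-estimator."""
    n, p = X.shape
    S = np.eye(p)
    for _ in range(n_iter):
        Si = np.linalg.inv(S)
        w = p / np.einsum('ij,jk,ik->i', X, Si, X)    # p / (x_i^T S^{-1} x_i)
        S = (X * w[:, None]).T @ X / n                # reweighted covariance
        S *= p / np.trace(S)                          # fix the scale ambiguity
    return S

def project_toeplitz(S):
    """Euclidean projection of a symmetric matrix onto Toeplitz matrices:
    average each diagonal."""
    p = S.shape[0]
    T = np.zeros_like(S)
    for k in range(p):
        v = np.diagonal(S, offset=k).mean()
        T += v * np.eye(p, k=k)
        if k:
            T += v * np.eye(p, k=-k)
    return T

# usage: compound Gaussian (heavy-tailed elliptical) samples, Toeplitz scatter
rng = np.random.default_rng(0)
p, n = 5, 500
true = 0.5 ** np.abs(np.subtract.outer(np.arange(p), np.arange(p)))
g = rng.multivariate_normal(np.zeros(p), true, size=n)
X = g * np.sqrt(rng.gamma(1.0, 1.0, size=n))[:, None]  # random texture
S_hat = project_toeplitz(tyler(X))
```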
Diagonal and Low-Rank Matrix Decompositions, Correlation Matrices, and Ellipsoid Fitting
In this paper we establish links between, and new results for, three problems
that are not usually considered together. The first is a matrix decomposition
problem that arises in areas such as statistical modeling and signal
processing: given a matrix $Y$ formed as the sum of an unknown diagonal matrix
and an unknown low rank positive semidefinite matrix, decompose $Y$ into these
constituents. The second problem we consider is to determine the facial
structure of the set of correlation matrices, a convex set also known as the
elliptope. This convex body, and particularly its facial structure, plays a
role in applications from combinatorial optimization to mathematical finance.
The third problem is a basic geometric question: given $k$ points
$v_1, \ldots, v_k \in \mathbb{R}^n$ (where $n < k$), determine whether there is a centered
ellipsoid passing \emph{exactly} through all of the points.
We show that in a precise sense these three problems are equivalent.
Furthermore we establish a simple sufficient condition on a subspace $\mathcal{U}$ that
ensures any positive semidefinite matrix $X$ with column space $\mathcal{U}$ can be
recovered from $X + D$ for any diagonal matrix $D$ using a convex
optimization-based heuristic known as minimum trace factor analysis. This
result leads to a new understanding of the structure of rank-deficient
correlation matrices and a simple condition on a set of points that ensures
there is a centered ellipsoid passing through them.
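As an illustration of the heuristic named above, here is a minimal sketch of minimum trace factor analysis, posed with the generic modeling tool cvxpy (a tooling assumption on our part; the paper analyzes the convex program, not an implementation): find the PSD matrix of smallest trace that agrees with $Y$ off the diagonal.

```python
# Minimum trace factor analysis as a semidefinite program; cvxpy and the
# rank-1 test instance are illustrative assumptions.
import numpy as np
import cvxpy as cp

def mtfa(Y):
    """PSD matrix of minimum trace matching Y off the diagonal."""
    p = Y.shape[0]
    X = cp.Variable((p, p), PSD=True)
    constraints = [X - cp.diag(cp.diag(X)) == Y - np.diag(np.diag(Y))]
    cp.Problem(cp.Minimize(cp.trace(X)), constraints).solve()
    return X.value

# usage: Y is diagonal plus rank-1 PSD
rng = np.random.default_rng(1)
u = rng.standard_normal((6, 1))
Y = np.diag(rng.uniform(1.0, 2.0, size=6)) + u @ u.T
L_hat = mtfa(Y)                        # approximately rank 1
D_hat = np.diag(Y - L_hat)             # recovered diagonal entries
```

Whether the low-rank part is recovered exactly is precisely what the paper's sufficient condition on its column space governs.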
Learning permutation symmetries with gips in R
The study of hidden structures in data presents challenges in modern
statistics and machine learning. We introduce the gips package in R, which
identifies permutation subgroup symmetries in Gaussian vectors. gips serves
two main purposes: exploratory analysis to discover hidden permutation
symmetries and estimation of the covariance matrix under permutation
symmetry. It is competitive with canonical methods in dimensionality
reduction while providing a new interpretation of the results. gips
implements a novel Bayesian model selection procedure within Gaussian
vectors invariant under the permutation subgroup introduced in Graczyk,
Ishi, Kołodziejek, Massam, Annals of Statistics, 50 (3) (2022).
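Since gips is an R package, the Python sketch below illustrates only the estimation step under a known symmetry, not the package's API or its Bayesian model selection for discovering that symmetry: the invariant estimate averages the sample covariance over the permutation subgroup, here the cyclic group generated by an assumed permutation sigma.

```python
# Projecting a sample covariance onto the set of matrices invariant under
# a cyclic permutation group; sigma and the data are assumptions.
import numpy as np

def symmetrize_covariance(S, sigma):
    """Average S over the cyclic group generated by sigma (an index
    array mapping i -> sigma[i])."""
    p = len(sigma)
    identity = np.arange(p)
    perm, total, count = identity.copy(), np.zeros_like(S), 0
    while True:
        total += S[np.ix_(perm, perm)]   # covariance conjugated by the permutation
        count += 1
        perm = perm[sigma]               # move to the next group element
        if np.array_equal(perm, identity):
            break
    return total / count

# usage: covariance assumed invariant under the 4-cycle (0 1 2 3)
rng = np.random.default_rng(2)
X = rng.standard_normal((100, 4))
sigma = np.array([1, 2, 3, 0])
S_sym = symmetrize_covariance(np.cov(X, rowvar=False), sigma)
```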
Matrix compression along isogenic blocks
A matrix-compression algorithm is derived from a novel isogenic block decomposition for square matrices. The resulting compression and inflation operations possess strong functorial and spectral-permanence properties. The basic observation that Hadamard entrywise functional calculus preserves isogenic blocks has already proved to be of paramount importance for thresholding large correlation matrices. The proposed isogenic stratification of the set of complex matrices bears similarities to the Schubert cell stratification of a homogeneous algebraic manifold. An array of potential applications to current investigations in computational matrix analysis is briefly mentioned, touching on concepts such as symmetric statistical models, hierarchical matrices and coherent matrix organization induced by partition trees.
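The abstract does not spell out the definition of isogenic blocks, so the sketch below adopts the simplest reading (an assumption on our part): a matrix that is constant on each block pair of an index partition, compressed by keeping one representative entry per pair. Under that reading it checks the stated compatibility with Hadamard entrywise functional calculus.

```python
# Block-constant compression and inflation; reading "isogenic" as
# block-constant is our assumption, not the paper's formal definition.
import numpy as np

def compress(A, blocks):
    """Keep one representative row/column per block (valid when A is
    constant on every block pair)."""
    reps = [b[0] for b in blocks]
    return A[np.ix_(reps, reps)]

def inflate(C, blocks):
    """Expand a compressed matrix back to full size."""
    n = sum(len(b) for b in blocks)
    A = np.zeros((n, n))
    for i, bi in enumerate(blocks):
        for j, bj in enumerate(blocks):
            A[np.ix_(bi, bj)] = C[i, j]
    return A

# check: entrywise (Hadamard) functional calculus commutes with compression
blocks = [np.array([0, 1]), np.array([2, 3, 4])]
A = inflate(np.array([[1.0, 0.3], [0.3, 2.0]]), blocks)
assert np.allclose(compress(np.tanh(A), blocks), np.tanh(compress(A, blocks)))
```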
Robustness and invariance in the generalization error of deep neural networks
In recent years, Deep Neural Networks (DNNs) have achieved state-of-the-art results in many fields such as speech recognition and computer vision. Despite their success in practice, many theoretical fundamentals of DNNs are still not well understood. One of them is the generalization error of DNNs, which is the topic of this thesis. The thesis first reviews the theory and practice of DNNs, focusing specifically on theoretical results that provide generalization error bounds. We argue that the current state-of-the-art theoretical results, which rely on the width and depth of deep neural networks, do not apply in many practical scenarios where the networks are very wide or very deep. A novel approach to the theoretical analysis of the generalization error of DNNs is proposed next. The proposed approach relies on the classification margin of the DNN and on the complexity of the data. As this result does not rely on the width or the depth of the network, it provides a rationale behind the practical success of learning with very wide and deep neural networks. These results are then extended to learning problems where symmetries are present in the data. The analysis shows that if a DNN is invariant to such symmetries, its generalization error may be much smaller than that of a non-invariant DNN. Finally, two novel regularization methods for DNNs, motivated by the theoretical analysis, are presented and their performance is evaluated on various datasets such as MNIST, CIFAR-10, ImageNet and LaRED. The thesis concludes with a summary of contributions and a discussion of possible extensions of the current work.
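As a pointer to the central quantity in the analysis, here is a minimal sketch of an output-layer classification margin (the thesis's precise definition may differ, e.g. a geometric margin in input space): the true-class score minus the best competing score, whose size relative to the data's complexity drives the bound.

```python
# Score margin at the output layer; the exact margin used in the thesis
# is not reproduced here, this is the common logit-gap variant.
import numpy as np

def output_margins(logits, labels):
    """logits: (n, c) network outputs; labels: (n,) true classes.
    Negative margins correspond to misclassified points."""
    idx = np.arange(len(labels))
    true_scores = logits[idx, labels]
    rivals = logits.copy()
    rivals[idx, labels] = -np.inf
    return true_scores - rivals.max(axis=1)

# usage on dummy logits for a 3-class problem
logits = np.array([[2.0, 0.5, -1.0],
                   [0.1, 0.2,  0.0]])
labels = np.array([0, 2])
print(output_margins(logits, labels))   # [ 1.5 -0.2]
```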