Using Underapproximations for Sparse Nonnegative Matrix Factorization
Nonnegative Matrix Factorization (NMF) consists of approximating a
nonnegative data matrix by the product of two low-rank nonnegative matrices. It
has been successfully applied as a data analysis technique in numerous domains,
e.g., text mining, image processing, microarray data analysis, collaborative
filtering, etc.
We introduce a novel approach to solve NMF problems, based on the use of an
underapproximation technique, and show its effectiveness in obtaining sparse
solutions. This approach, based on Lagrangian relaxation, allows NMF problems
to be solved recursively. We also prove that the
underapproximation problem is NP-hard for any fixed factorization rank, using a
reduction from the maximum edge biclique problem in bipartite graphs.
We test two variants of our underapproximation approach on several standard
image datasets and show that they provide sparse part-based representations
with low reconstruction error. Our results are comparable and sometimes
superior to those obtained by two standard Sparse Nonnegative Matrix
Factorization techniques.
Comment: Version 2 removed the section about convex reformulations, which was
not central to the development of our main results; added material to the
introduction; added a review of previous related work (section 2.3);
completely rewrote the last part (section 4) to provide extensive numerical
results supporting our claims. Accepted in J. of Pattern Recognition
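The recursive idea in this abstract can be illustrated with a small sketch. It is not the Lagrangian-relaxation algorithm of the paper: it swaps in a simple alternating heuristic in which each factor is updated by clipping its least-squares value to the bound imposed by the underapproximation constraint u v^T <= R. The function names (`rank_one_underapprox`, `recursive_nmu`, `_clipped_update`) and the iteration count are hypothetical. Because every rank-one term stays below the current residual, the residual remains nonnegative and the same routine can be applied to it again.

```python
import numpy as np

def _clipped_update(R, v):
    """Best u >= 0 for fixed v >= 0 (v not all zero) subject to u_i * v_j <= R_ij."""
    support = v > 0
    # Unconstrained least-squares value for each row of R.
    ls = R @ v / (v @ v)
    # Largest u_i that keeps u_i * v_j <= R_ij on the support of v.
    bound = np.min(R[:, support] / v[support], axis=1)
    return np.clip(ls, 0.0, bound)

def rank_one_underapprox(R, n_iter=50):
    """Heuristic for min ||R - u v^T||_F subject to u v^T <= R, u >= 0, v >= 0."""
    m, n = R.shape
    u, v = np.zeros(m), np.ones(n)
    for _ in range(n_iter):
        u = _clipped_update(R, v)
        if not u.any():
            break
        v = _clipped_update(R.T, u)
        if not v.any():
            break
    return u, v

def recursive_nmu(M, rank):
    """Peel off `rank` rank-one underapproximations, one residual at a time."""
    R = np.asarray(M, dtype=float).copy()
    U, V = [], []
    for _ in range(rank):
        u, v = rank_one_underapprox(R)
        # Clip only to absorb floating-point rounding; the residual is
        # nonnegative by construction.
        R = np.maximum(R - np.outer(u, v), 0.0)
        U.append(u)
        V.append(v)
    return np.column_stack(U), np.vstack(V), R
```

A zero entry in R forces the corresponding entry of u or v to zero, which is one way to see why underapproximation constraints tend to produce sparse factors.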
Optimal low-rank approximations of Bayesian linear inverse problems
In the Bayesian approach to inverse problems, data are often informative,
relative to the prior, only on a low-dimensional subspace of the parameter
space. Significant computational savings can be achieved by using this subspace
to characterize and approximate the posterior distribution of the parameters.
We first investigate approximation of the posterior covariance matrix as a
low-rank update of the prior covariance matrix. We prove optimality of a
particular update, based on the leading eigendirections of the matrix pencil
defined by the Hessian of the negative log-likelihood and the prior precision,
for a broad class of loss functions. This class includes the Förstner
metric for symmetric positive definite matrices, as well as the
Kullback-Leibler divergence and the Hellinger distance between the associated
distributions. We also propose two fast approximations of the posterior mean
and prove their optimality with respect to a weighted Bayes risk under
squared-error loss. These approximations are deployed in an offline-online
manner, where a more costly but data-independent offline calculation is
followed by fast online evaluations. As a result, these approximations are
particularly useful when repeated posterior mean evaluations are required for
multiple data sets. We demonstrate our theoretical results with several
numerical examples, including high-dimensional X-ray tomography and an inverse
heat conduction problem. In both of these examples, the intrinsic
low-dimensional structure of the inference problem can be exploited while
producing results that are essentially indistinguishable from solutions
computed in the full space.
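A minimal sketch of the low-rank covariance update described above, assuming a linear-Gaussian setting in which the exact posterior covariance is the inverse of H plus the prior precision. The update subtracts from the prior covariance a rank-r term built from the leading generalized eigenpairs of the pencil (H, prior precision). The function name `lowrank_posterior_cov` and its arguments are hypothetical, and the pencil is solved here through a Cholesky factor of the prior covariance, which is one of several equivalent routes.

```python
import numpy as np

def lowrank_posterior_cov(H, prior_cov, r):
    """Approximate (H + prior_cov^{-1})^{-1} by a rank-r update of prior_cov.

    H         : Hessian of the negative log-likelihood (symmetric PSD)
    prior_cov : prior covariance (symmetric positive definite)
    r         : number of leading eigendirections to keep
    """
    # With prior_cov = L L^T, the pencil H w = d2 * prior_cov^{-1} w becomes
    # the ordinary symmetric eigenproblem (L^T H L) z = d2 * z with w = L z.
    L = np.linalg.cholesky(prior_cov)
    evals, Z = np.linalg.eigh(L.T @ H @ L)
    idx = np.argsort(evals)[::-1][:r]      # r largest eigenvalues
    d2 = np.maximum(evals[idx], 0.0)       # guard against tiny negative rounding
    W = L @ Z[:, idx]                      # satisfies W^T prior_cov^{-1} W = I
    # Negative update: directions informed by the data lose prior variance.
    return prior_cov - (W * (d2 / (1.0 + d2))) @ W.T
```

When H has rank k, as with k linear observations, taking r = k reproduces the exact posterior covariance, which makes a convenient sanity check for the sketch.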
Multi-resolution Low-rank Tensor Formats
We describe a simple, black-box compression format for tensors with a
multiscale structure. By representing the tensor as a sum of compressed tensors
defined on increasingly coarse grids, we capture low-rank structures on each
grid-scale, and we show how this leads to an increase in compression for a
fixed accuracy. We devise an alternating algorithm to represent a given tensor
in the multiresolution format and prove local convergence guarantees. In two
dimensions, we provide examples that show that this approach can beat the
Eckart-Young theorem, and for dimensions higher than two, we achieve higher
compression than the tensor-train format on six real-world datasets. We also
provide results on the closedness and stability of the tensor format and
discuss how to perform common linear algebra operations on the level of the
compressed tensors.
Comment: 29 pages, 9 figures
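To make the "sum of compressed tensors on increasingly coarse grids" idea concrete, here is a two-level, two-dimensional sketch. It is an illustration under stated assumptions, not the format or algorithm of the paper: the coarse grid is obtained by block averaging, the coarse term is mapped back to the fine grid by piecewise-constant replication, and each term is compressed with a truncated SVD. All names (`truncated_svd`, `coarsen`, `refine`, `two_level_fit`) are hypothetical, and the matrix dimensions are assumed to be divisible by the block size `b`.

```python
import numpy as np

def truncated_svd(A, rank):
    """Best rank-`rank` approximation of A in the Frobenius norm."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    return (U[:, :rank] * s[:rank]) @ Vt[:rank]

def coarsen(A, b):
    """Average over b-by-b blocks (assumes both dimensions divisible by b)."""
    m, n = A.shape
    return A.reshape(m // b, b, n // b, b).mean(axis=(1, 3))

def refine(C, b):
    """Replicate each coarse entry over a b-by-b block of the fine grid."""
    return np.kron(C, np.ones((b, b)))

def two_level_fit(A, fine_rank, coarse_rank, b=2, sweeps=10):
    """Alternately refit the fine-grid and coarse-grid low-rank terms."""
    A = np.asarray(A, dtype=float)
    m, n = A.shape
    coarse = np.zeros((m // b, n // b))
    errors = []
    for _ in range(sweeps):
        # Fix the coarse term; the best fine term is a truncated SVD of the rest.
        fine = truncated_svd(A - refine(coarse, b), fine_rank)
        # Fix the fine term; the best coarse term is a truncated SVD of the
        # block-averaged remainder.
        coarse = truncated_svd(coarsen(A - fine, b), coarse_rank)
        errors.append(np.linalg.norm(A - fine - refine(coarse, b)))
    return fine, coarse, errors
```

Each half-step solves its subproblem exactly, so the recorded errors do not increase from sweep to sweep, and the result is never worse than a single truncated SVD of rank `fine_rank` on the fine grid.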