Sparse and Unique Nonnegative Matrix Factorization Through Data Preprocessing
Nonnegative matrix factorization (NMF) has become a very popular technique in
machine learning because it automatically extracts meaningful features through
a sparse and part-based representation. However, NMF has the drawback of being
highly ill-posed, that is, there typically exist many different but equivalent
factorizations. In this paper, we introduce a completely new way of obtaining
more well-posed NMF problems whose solutions are sparser. Our technique is
based on the preprocessing of the nonnegative input data matrix, and relies on
the theory of M-matrices and the geometric interpretation of NMF. This approach
provably leads to optimal and sparse solutions under the separability
assumption of Donoho and Stodden (NIPS, 2003), and, for rank-three matrices,
makes the number of exact factorizations finite. We illustrate the
effectiveness of our technique on several image datasets.
Comment: 34 pages, 11 figures
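The ill-posedness described in this abstract can be illustrated with a small NumPy sketch. The matrices W, H and Q below are made-up values, not taken from the paper: any invertible Q for which both W Q and Q^{-1} H stay nonnegative gives a second, equally exact factorization of the same data.

```python
import numpy as np

# Hypothetical nonnegative factors (illustrative values only).
W = np.array([[1.0, 1.0],
              [2.0, 1.0],
              [1.0, 2.0]])
H = np.array([[1.0, 0.0, 2.0],
              [0.0, 1.0, 1.0]])
X = W @ H  # the nonnegative data matrix

# An invertible mixing matrix that is not a permutation or a scaling.
# Its inverse is entrywise nonnegative, so Q^{-1} H stays nonnegative.
Q = np.array([[1.0, -0.1],
              [-0.1, 1.0]])
Q_inv = np.linalg.inv(Q)

W2 = W @ Q      # still nonnegative for this particular W
H2 = Q_inv @ H  # still nonnegative because Q_inv >= 0 and H >= 0

same_product = np.allclose(X, W2 @ H2)
both_nonnegative = bool((W2 >= 0).all() and (H2 >= 0).all())
print(same_product, both_nonnegative)
```

Both (W, H) and (W2, H2) reproduce X exactly, which is the kind of non-uniqueness a preprocessing step would aim to reduce.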
Spectral Unmixing with Multiple Dictionaries
Spectral unmixing aims at recovering the spectral signatures of materials,
called endmembers, mixed in a hyperspectral or multispectral image, along with
their abundances. A typical assumption is that the image contains one pure
pixel per endmember, in which case spectral unmixing reduces to identifying
these pixels. Many fully automated methods have been proposed in recent years,
but little work has been done to allow users to select areas where pure pixels
are present manually or using a segmentation algorithm. Additionally, in a
non-blind approach, several spectral libraries may be available rather than a
single one, with a fixed number (or an upper or lower bound) of endmembers to
choose from each. In this paper, we propose a multiple-dictionary constrained
low-rank matrix approximation model that addresses these two problems. We propose
an algorithm to compute this model, dubbed M2PALS, and its performance is
discussed on both synthetic and real hyperspectral images.
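Under the pure-pixel assumption stated in this abstract, unmixing amounts to locating the pure pixels among the columns of the data matrix. A minimal sketch of one standard way to do this, the successive projection algorithm (SPA), is given below. This is not M2PALS; the function name spa and the toy matrices W, H and X are illustrative assumptions.

```python
import numpy as np

def spa(X, r):
    """Pick r columns of X by successive projection (noiseless sketch)."""
    R = np.asarray(X, dtype=float).copy()
    picked = []
    for _ in range(r):
        # The column with the largest residual norm is taken as a pure pixel.
        j = int(np.argmax(np.sum(R ** 2, axis=0)))
        picked.append(j)
        # Project every column onto the orthogonal complement of that column.
        u = R[:, j] / np.linalg.norm(R[:, j])
        R = R - np.outer(u, u @ R)
    return picked

# Toy endmember signatures (columns of W) and abundances (columns of H sum to 1).
W = np.array([[1.0, 0.2, 0.1],
              [0.1, 1.0, 0.3],
              [0.2, 0.1, 1.0],
              [0.5, 0.5, 0.5]])
H = np.array([[0.5, 0.0, 0.2, 1.0, 1 / 3, 0.0],
              [0.5, 0.0, 0.3, 0.0, 1 / 3, 1.0],
              [0.0, 1.0, 0.5, 0.0, 1 / 3, 0.0]])
X = W @ H  # columns 1, 3 and 5 are pure pixels, the others are mixtures

pure = spa(X, 3)
print(sorted(pure))
```

Restricting the candidate columns to a user-selected region, as the abstract suggests, would correspond to running such a search on a subset of the columns of X.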
Rounding Sum-of-Squares Relaxations
We present a general approach to rounding semidefinite programming
relaxations obtained by the Sum-of-Squares method (Lasserre hierarchy). Our
approach is based on using the connection between these relaxations and the
Sum-of-Squares proof system to transform a *combining algorithm* -- an
algorithm that maps a distribution over solutions into a (possibly weaker)
solution -- into a *rounding algorithm* that maps a solution of the relaxation
to a solution of the original problem.
Using this approach, we obtain algorithms that yield improved results for
natural variants of three well-known problems:
1) We give a quasipolynomial-time algorithm that approximates the maximum of
a low degree multivariate polynomial with non-negative coefficients over the
Euclidean unit sphere. Beyond being of interest in its own right, this is
related to an open question in quantum information theory, and our techniques
have already led to improved results in this area (Brandão and Harrow, STOC
'13).
2) We give a polynomial-time algorithm that, given a d-dimensional subspace
of R^n that (almost) contains the characteristic function of a set of size n/k,
finds a vector in the subspace satisfying ,
where . Aside from being a natural relaxation, this
is also motivated by a connection to the Small Set Expansion problem shown by
Barak et al. (STOC 2012) and our results yield a certain improvement for that
problem.
3) We use this notion of L_4 vs. L_2 sparsity to obtain a polynomial-time
algorithm with substantially improved guarantees for recovering a planted
mu-sparse vector v in a random d-dimensional subspace of R^n. If v has mu n
nonzero coordinates, we can recover it with high probability whenever , improving on prior methods which
intrinsically required
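The notion of L_4 vs. L_2 sparsity used in this abstract can be made concrete with a short sketch. It assumes expectation-normalized norms, ||v||_p = (E_i |v_i|^p)^(1/p); this normalization is an assumption, since the abstract's own definition is not reproduced here. Under it, the characteristic function of a set of size n/k has ||v||_4^4 / ||v||_2^4 = k, while a flat vector has ratio 1. The function name l4_l2_ratio is hypothetical.

```python
import numpy as np

def l4_l2_ratio(v):
    """Return ||v||_4^4 / ||v||_2^4 with norms taken as averages over coordinates."""
    v = np.asarray(v, dtype=float)
    return np.mean(v ** 4) / np.mean(v ** 2) ** 2

n, k = 100, 4

# Characteristic function of a set of size n/k: a sparse vector.
sparse_v = np.zeros(n)
sparse_v[: n // k] = 1.0

# A completely flat (dense) vector for comparison.
dense_v = np.ones(n)

ratio_sparse = l4_l2_ratio(sparse_v)
ratio_dense = l4_l2_ratio(dense_v)
print(ratio_sparse, ratio_dense)
```

A large ratio therefore acts as an analytic proxy for sparsity: the fewer coordinates carry the mass, the larger the fourth moment is relative to the squared second moment.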
Using Underapproximations for Sparse Nonnegative Matrix Factorization
Nonnegative Matrix Factorization (NMF) consists in (approximately) factorizing a
nonnegative data matrix as the product of two low-rank nonnegative matrices. It
has been successfully applied as a data analysis technique in numerous domains,
e.g., text mining, image processing, microarray data analysis, collaborative
filtering, etc.
We introduce a novel approach to solving NMF problems, based on an
underapproximation technique, and show its effectiveness in obtaining sparse
solutions. This approach, based on Lagrangian relaxation, allows NMF problems
to be solved recursively. We also prove that the
underapproximation problem is NP-hard for any fixed factorization rank, using a
reduction from the maximum edge biclique problem in bipartite graphs.
We test two variants of our underapproximation approach on several standard
image datasets and show that they provide sparse part-based representations
with low reconstruction error. Our results are comparable and sometimes
superior to those obtained by two standard Sparse Nonnegative Matrix
Factorization techniques.
Comment: Version 2 removed the section about convex reformulations, which was
not central to the development of our main results; added material to the
introduction; added a review of previous related work (section 2.3);
completely rewrote the last part (section 4) to provide extensive numerical
results supporting our claims. Accepted in J. of Pattern Recognition.
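The recursive use of underapproximations described in this abstract can be sketched as follows. This is not the Lagrangian relaxation method of the paper: it is a simple greedy heuristic, chosen only because it makes the key property easy to see. Each rank-one term u v^T satisfies u v^T <= R entrywise, so the residual R - u v^T stays nonnegative and the same step can be applied again. The function name greedy_underapprox and the matrix M are illustrative assumptions.

```python
import numpy as np

def greedy_underapprox(M, r):
    """Peel off up to r nonnegative rank-one terms, each lying below the residual."""
    R = np.asarray(M, dtype=float).copy()
    U, V = [], []
    for _ in range(r):
        # Use the heaviest remaining row as the right factor v.
        i0 = int(np.argmax(R.sum(axis=1)))
        v = R[i0].copy()
        support = v > 0
        if not support.any():
            break  # residual is already zero
        # Largest u such that u_i * v_j <= R_ij for every j in the support of v.
        u = np.min(R[:, support] / v[support], axis=1)
        U.append(u)
        V.append(v)
        # Clip to guard against tiny negative values from rounding.
        R = np.maximum(R - np.outer(u, v), 0.0)
    return np.array(U).T, np.array(V), R

M = np.array([[4.0, 2.0, 1.0],
              [2.0, 3.0, 1.0],
              [1.0, 1.0, 5.0],
              [3.0, 2.0, 2.0]])
U, V, R = greedy_underapprox(M, 2)
print(U.shape, V.shape, bool((R >= 0).all()))
```

Because every term is an underapproximation, each new factor only has to explain what the previous ones left over, which is what encourages sparse, part-based factors in this family of methods.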
Dictionary-based Tensor Canonical Polyadic Decomposition
To ensure interpretability of extracted sources in tensor decomposition, we
introduce in this paper a dictionary-based tensor canonical polyadic
decomposition which enforces one factor to belong exactly to a known
dictionary. A new formulation of sparse coding is proposed which enables
dictionary-based canonical polyadic decomposition of high-dimensional tensors. The
benefits of using a dictionary in tensor decomposition models are explored both
in terms of parameter identifiability and estimation accuracy. The performance of
the proposed algorithms is evaluated on the decomposition of simulated data
and the unmixing of hyperspectral images.
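The constraint that one factor must belong exactly to a known dictionary can be pictured with a small sketch: given an unconstrained estimate A of that factor, each column is replaced by the dictionary atom it correlates with most strongly. This is only an illustration of the constraint, not the algorithm proposed in the paper; the function name select_atoms and the matrices D and A are made-up assumptions.

```python
import numpy as np

def select_atoms(A, D):
    """Replace each column of A by the most correlated column (atom) of D."""
    # Normalize columns so that the choice does not depend on scaling.
    Dn = D / np.linalg.norm(D, axis=0, keepdims=True)
    An = A / np.linalg.norm(A, axis=0, keepdims=True)
    corr = np.abs(Dn.T @ An)       # shape: (number of atoms, number of components)
    idx = np.argmax(corr, axis=0)  # best atom index for each component
    return D[:, idx], idx

# Hypothetical dictionary with three atoms (columns).
D = np.array([[1.0, 0.0, 0.0],
              [0.0, 1.0, 0.0],
              [0.0, 0.0, 1.0],
              [1.0, 1.0, 0.0]])

# Noisy, rescaled estimate of a factor built from atoms 2 and 0.
A = np.array([[0.05, 0.50],
              [0.00, 0.02],
              [2.00, 0.00],
              [0.01, 0.50]])

A_proj, idx = select_atoms(A, D)
print(idx.tolist())
```

In an alternating scheme, a selection step of this kind would be interleaved with least-squares updates of the remaining factors.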