112 research outputs found
Computing a Nonnegative Matrix Factorization -- Provably
In the Nonnegative Matrix Factorization (NMF) problem we are given an $n \times m$ nonnegative matrix $M$ and an integer $r > 0$. Our goal is to express
$M$ as $AW$, where $A$ and $W$ are nonnegative matrices of size $n \times r$
and $r \times m$ respectively. In some applications, it makes sense to ask
instead for the product $AW$ to approximate $M$, i.e. to (approximately)
minimize $\|M - AW\|_F$, where $\|\cdot\|_F$ denotes the Frobenius norm; we
refer to this as Approximate NMF. This problem has a rich history spanning
quantum mechanics, probability theory, data analysis, polyhedral combinatorics,
communication complexity, demography, chemometrics, etc. In the past decade NMF
has become enormously popular in machine learning, where $A$ and $W$ are
computed using a variety of local search heuristics. Vavasis proved that this
problem is NP-complete. We initiate a study of when this problem is solvable in
polynomial time:
1. We give a polynomial-time algorithm for exact and approximate NMF for
every constant $r$. Indeed NMF is most interesting in applications precisely
when $r$ is small.
2. We complement this with a hardness result: if exact NMF can be solved
in time $(nm)^{o(r)}$, then 3-SAT has a sub-exponential time algorithm. This rules
out substantial improvements to the above algorithm.
3. We give an algorithm that runs in time polynomial in $n$, $m$, and $r$
under the separability condition identified by Donoho and Stodden in 2003. The
algorithm may be practical since it is simple and noise tolerant (under benign
assumptions). Separability is believed to hold in many practical settings.
To the best of our knowledge, this last result is the first example of a
polynomial-time algorithm that provably works under a non-trivial condition on
the input and we believe that this will be an interesting and important
direction for future work. Comment: 29 pages, 3 figures
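The abstract above mentions that, in practice, the factors are usually computed with local search heuristics. As an illustration only, here is a minimal sketch of one well-known heuristic of that kind, the Lee-Seung multiplicative updates for the Frobenius objective. It is not the polynomial-time algorithm of the paper, and the function names, iteration count, and `eps` safeguard are assumptions made for this example.

```python
import numpy as np

def nmf_multiplicative(M, r, iters=500, seed=0, eps=1e-12):
    """Heuristically minimize ||M - A W||_F subject to A >= 0, W >= 0.

    Uses Lee-Seung multiplicative updates (a local search heuristic with
    no global optimality guarantee). M must be entrywise nonnegative.
    """
    rng = np.random.default_rng(seed)
    n, m = M.shape
    # Strictly positive random start so that no entry is stuck at zero.
    A = rng.random((n, r)) + eps
    W = rng.random((r, m)) + eps
    for _ in range(iters):
        # Each update multiplies by a nonnegative ratio, so signs are preserved.
        W *= (A.T @ M) / (A.T @ A @ W + eps)
        A *= (M @ W.T) / (A @ W @ W.T + eps)
    return A, W

def frobenius_error(M, A, W):
    """Approximate-NMF objective ||M - A W||_F."""
    return np.linalg.norm(M - A @ W, ord="fro")

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    M = rng.random((6, 2)) @ rng.random((2, 5))  # nonnegative, rank at most 2
    A, W = nmf_multiplicative(M, r=2)
    print(frobenius_error(M, A, W))
```

Because the updates only find a local solution, different seeds can give different factors; this is the gap that provable algorithms such as those in the abstract aim to close.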
Generalized Separable Nonnegative Matrix Factorization
Nonnegative matrix factorization (NMF) is a linear dimensionality reduction technique
for nonnegative data with applications such as image analysis, text mining,
audio source separation and hyperspectral unmixing. Given a data matrix $M$ and
a factorization rank $r$, NMF looks for a nonnegative matrix $W$ with $r$
columns and a nonnegative matrix $H$ with $r$ rows such that $M \approx WH$.
NMF is NP-hard to solve in general. However, it can be computed efficiently
under the separability assumption which requires that the basis vectors appear
as data points, that is, that there exists an index set $\mathcal{K}$ such that
$W = M(:,\mathcal{K})$. In this paper, we generalize the separability
assumption: we only require that for each rank-one factor $W(:,k)H(k,:)$ for
$k = 1, 2, \dots, r$, either $W(:,k) = M(:,j)$ for some $j$ or $H(k,:) = M(i,:)$ for
some $i$. We refer to the corresponding problem as generalized separable NMF
(GS-NMF). We discuss some properties of GS-NMF and propose a convex
optimization model which we solve using a fast gradient method. We also propose
a heuristic algorithm inspired by the successive projection algorithm. To
verify the effectiveness of our methods, we compare them with several
state-of-the-art separable NMF algorithms on synthetic, document and image data
sets. Comment: 31 pages, 12 figures, 4 tables. We have added discussions about the
identifiability of the model, we have modified the first synthetic
experiment, we have clarified some aspects of the contributio
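The abstract above says its heuristic is inspired by the successive projection algorithm (SPA). For orientation, here is a minimal sketch of plain SPA for the ordinary (column) separable case, not the GS-NMF method itself. It assumes the basis vectors appear among the columns of the data matrix, that they are linearly independent, and that every other column is a convex combination of them; the function name `spa` and the toy data are made up for this example.

```python
import numpy as np

def spa(M, r):
    """Successive projection algorithm (sketch).

    Returns r column indices of M: repeatedly pick the column of the
    residual with the largest Euclidean norm, then project all columns
    onto the orthogonal complement of the picked column.
    """
    R = np.array(M, dtype=float)           # residual matrix (copy of M)
    picked = []
    for _ in range(r):
        norms = np.sum(R * R, axis=0)      # squared column norms
        j = int(np.argmax(norms))
        if norms[j] == 0.0:                # nothing left to extract
            break
        picked.append(j)
        u = R[:, j] / np.sqrt(norms[j])    # unit vector along the picked column
        R -= np.outer(u, u @ R)            # remove that direction from every column
    return picked

if __name__ == "__main__":
    # Toy separable data: three basis columns plus convex mixtures of them.
    W = np.array([[1, 0, 0], [0, 1, 0], [0, 0, 1], [1, 1, 0], [0, 1, 1]], dtype=float)
    H_mix = np.array([[0.5, 0.5, 0.0], [0.5, 0.0, 0.5], [0.0, 0.5, 0.5]]).T
    M = np.hstack([W @ H_mix, W])          # basis columns sit at indices 3, 4, 5
    print(sorted(spa(M, 3)))
```

Once the index set is known, the basis is read off as the selected columns of the data matrix, and the remaining factor can be obtained by nonnegative least squares.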
Coordinate Descent for Mixed-norm NMF
Nonnegative matrix factorization (NMF) is widely used in a variety of machine learning tasks
involving speech, documents and images. Being able to specify the structure of the matrix factors
is crucial in incorporating prior information. The factors correspond to the feature matrix and
the learnt representation. In particular, we allow a user-friendly specification of sparsity on the
groups of features using the L1/L2 measure. Also, we propose a pairwise coordinate descent
algorithm to minimize the objective. Experimental evidence of the efficacy of this approach is
provided on the ORL faces dataset.
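The abstract above does not spell out its L1/L2 measure. One widely used sparsity measure built from the ratio of the L1 and L2 norms is Hoyer's sparseness, sketched below. Treating it as the measure meant here is an assumption, and the function name `hoyer_sparseness` is chosen only for this example.

```python
import numpy as np

def hoyer_sparseness(x):
    """Hoyer's sparseness of a nonzero vector with at least two entries.

    sparseness(x) = (sqrt(n) - ||x||_1 / ||x||_2) / (sqrt(n) - 1)

    The value is 1 when exactly one entry is nonzero and 0 when all
    entries have the same magnitude.
    """
    x = np.asarray(x, dtype=float).ravel()
    n = x.size
    l1 = np.sum(np.abs(x))
    l2 = np.sqrt(np.sum(x * x))
    return (np.sqrt(n) - l1 / l2) / (np.sqrt(n) - 1.0)

if __name__ == "__main__":
    print(hoyer_sparseness([0.0, 0.0, 3.0, 0.0]))  # a single active entry
    print(hoyer_sparseness([1.0, 1.0, 1.0, 1.0]))  # all entries equal
```

Applied to a group of features (for example, a block of columns of a factor flattened into one vector), such a measure lets a user state how sparse each group should be as a number between 0 and 1.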
Computing approximate PSD factorizations
We give an algorithm for computing approximate PSD factorizations of
nonnegative matrices. The running time of the algorithm is polynomial in the
dimensions of the input matrix, but exponential in the PSD rank and the
approximation error. The main ingredient is an exact factorization algorithm
when the rows and columns of the factors are constrained to lie in a general
polyhedron. This strictly generalizes nonnegative matrix factorizations which
can be captured by letting this polyhedron be the nonnegative orthant. Comment: 10 pages
Using separable non-negative matrix factorization techniques for the analysis of time-resolved Raman spectra
The key challenge of time-resolved Raman spectroscopy is the identification
of the constituent species and the analysis of the kinetics of the underlying
reaction network. In this work we present an integral approach that allows for
determining both the component spectra and the rate constants simultaneously
from a series of vibrational spectra. It is based on an algorithm for
non-negative matrix factorization which is applied to the experimental data set
following a few pre-processing steps. As a prerequisite for physically
unambiguous solutions, each component spectrum must include one vibrational
band that does not significantly interfere with vibrational bands of other
species. The approach is applied to synthetic "experimental" spectra derived
from model systems comprising a set of species with component spectra differing
with respect to their degree of spectral interferences and signal-to-noise
ratios. In each case, the species involved are connected via monomolecular
reaction pathways. The potential and limitations of the approach for recovering
the respective rate constants and component spectra are discussed.
Fast Conical Hull Algorithms for Near-separable Non-negative Matrix Factorization
The separability assumption (Donoho & Stodden, 2003; Arora et al., 2012)
turns non-negative matrix factorization (NMF) into a tractable problem.
Recently, a new class of provably-correct NMF algorithms has emerged under
this assumption. In this paper, we reformulate the separable NMF problem as
that of finding the extreme rays of the conical hull of a finite set of
vectors. From this geometric perspective, we derive new separable NMF
algorithms that are highly scalable and empirically noise robust, and have
several other favorable properties in relation to existing methods. A parallel
implementation of our algorithm demonstrates high scalability on shared- and
distributed-memory machines. Comment: 15 pages, 6 figures