
Computing a Nonnegative Matrix Factorization -- Provably

Authors: Arora Sanjeev, Ge Rong, Kannan Ravi, Moitra Ankur
Publication date: 03/11/2011

In the Nonnegative Matrix Factorization (NMF) problem we are given an $n \times m$ nonnegative matrix $M$ and an integer $r > 0$. Our goal is to express $M$ as $AW$, where $A$ and $W$ are nonnegative matrices of size $n \times r$ and $r \times m$ respectively. In some applications, it makes sense to ask instead for the product $AW$ to approximate $M$, i.e. to (approximately) minimize $\|M - AW\|_F$, where $\|\cdot\|_F$ denotes the Frobenius norm; we refer to this as Approximate NMF. This problem has a rich history spanning quantum mechanics, probability theory, data analysis, polyhedral combinatorics, communication complexity, demography, chemometrics, etc. In the past decade NMF has become enormously popular in machine learning, where $A$ and $W$ are computed using a variety of local search heuristics. Vavasis proved that this problem is NP-complete. We initiate a study of when this problem is solvable in polynomial time: 1. We give a polynomial-time algorithm for exact and approximate NMF for every constant $r$. Indeed NMF is most interesting in applications precisely when $r$ is small. 2. We complement this with a hardness result, that if exact NMF can be solved in time $(nm)^{o(r)}$, 3-SAT has a sub-exponential time algorithm. This rules out substantial improvements to the above algorithm. 3. We give an algorithm that runs in time polynomial in $n$, $m$ and $r$ under the separability condition identified by Donoho and Stodden in 2003. The algorithm may be practical since it is simple and noise tolerant (under benign assumptions). Separability is believed to hold in many practical settings. To the best of our knowledge, this last result is the first example of a polynomial-time algorithm that provably works under a non-trivial condition on the input and we believe that this will be an interesting and important direction for future work.

Comment: 29 pages, 3 figures

arXiv.org e-Print Archive

CiteSeerX

Princeton University Open Access Repository
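
The Approximate NMF objective in the abstract above, and the kind of local search heuristic it mentions, can be illustrated with a small sketch. The code below uses the classical multiplicative update rule, which is a swapped-in example of such a heuristic and not the algorithm of the paper; the toy matrices, the helper names (`matmul`, `frobenius_error`, `multiplicative_step`), the iteration count, and the `eps` safeguard are all assumptions made for illustration.

```python
import math
import random


def matmul(X, Y):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]


def transpose(X):
    return [list(col) for col in zip(*X)]


def frobenius_error(M, A, W):
    """Return ||M - AW||_F."""
    AW = matmul(A, W)
    return math.sqrt(sum((m - p) ** 2
                         for row_m, row_p in zip(M, AW)
                         for m, p in zip(row_m, row_p)))


def multiplicative_step(M, A, W, eps=1e-12):
    """One round of multiplicative updates; keeps A and W entrywise nonnegative."""
    At = transpose(A)
    num, den = matmul(At, M), matmul(matmul(At, A), W)
    W = [[w * n / (d + eps) for w, n, d in zip(rw, rn, rd)]
         for rw, rn, rd in zip(W, num, den)]
    Wt = transpose(W)
    num, den = matmul(M, Wt), matmul(A, matmul(W, Wt))
    A = [[a * n / (d + eps) for a, n, d in zip(ra, rn, rd)]
         for ra, rn, rd in zip(A, num, den)]
    return A, W


# Hypothetical toy instance: M is an exact product of nonnegative factors (n=4, m=3, r=2).
A_true = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 1.0]]
W_true = [[1.0, 0.0, 2.0], [0.0, 3.0, 1.0]]
M = matmul(A_true, W_true)

# Start from a random positive guess and run the heuristic.
rng = random.Random(0)
A = [[rng.uniform(0.1, 1.0) for _ in range(2)] for _ in range(4)]
W = [[rng.uniform(0.1, 1.0) for _ in range(3)] for _ in range(2)]
initial_error = frobenius_error(M, A, W)
for _ in range(200):
    A, W = multiplicative_step(M, A, W)
final_error = frobenius_error(M, A, W)
print(initial_error, final_error)
```

A heuristic of this kind lowers the Frobenius error from its starting point but carries no guarantee of reaching the best factorization, which is the gap the provable algorithms in this listing aim to close.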

Robustness Analysis of Hottopixx, a Linear Programming Model for Factoring Nonnegative Matrices

Author: Gillis Nicolas
Publication venue: 'Society for Industrial & Applied Mathematics (SIAM)'
Publication date: 01/01/2013

Although nonnegative matrix factorization (NMF) is NP-hard in general, it has been shown very recently that it is tractable under the assumption that the input nonnegative data matrix is close to being separable (separability requires that all columns of the input matrix belong to the cone spanned by a small subset of these columns). Since then, several algorithms have been designed to handle this subclass of NMF problems. In particular, Bittorf, Recht, Ré and Tropp ('Factoring nonnegative matrices with linear programs', NIPS 2012) proposed a linear programming model, referred to as Hottopixx. In this paper, we provide a new and more general robustness analysis of their method. In particular, we design a provably more robust variant using a post-processing strategy which allows us to deal with duplicates and near duplicates in the dataset.

Comment: 23 pages; new numerical results; Comparison with Arora et al.; Accepted in SIAM J. Mat. Anal. App

arXiv.org e-Print Archive

CiteSeerX

DIAL UCLouvain

Intersecting Faces: Non-negative Matrix Factorization With New Guarantees

Authors: Ge Rong, Zou James
Publication date: 01/01/2015

Non-negative matrix factorization (NMF) is a natural model of admixture and is widely used in science and engineering. A plethora of algorithms have been developed to tackle NMF, but due to the non-convex nature of the problem, there is little guarantee on how well these methods work. Recently a surge of research has focused on a very restricted class of NMFs, called separable NMF, where provably correct algorithms have been developed. In this paper, we propose the notion of subset-separable NMF, which substantially generalizes the property of separability. We show that subset-separability is a natural necessary condition for the factorization to be unique or to have minimum volume. We developed the Face-Intersect algorithm which provably and efficiently solves subset-separable NMF under natural conditions, and we prove that our algorithm is robust to small noise. We explored the performance of Face-Intersect on simulations and discuss settings where it empirically outperformed the state-of-the-art methods. Our work is a step towards finding provably correct algorithms that solve large classes of NMF problems.

arXiv.org e-Print Archive

CiteSeerX

Factoring nonnegative matrices with linear programs

Authors: Bittorf Victor, Re Christopher, Recht Benjamin, Tropp Joel A.
Publication date: 01/01/2012

This paper describes a new approach, based on linear programming, for computing nonnegative matrix factorizations (NMFs). The key idea is a data-driven model for the factorization where the most salient features in the data are used to express the remaining features. More precisely, given a data matrix X, the algorithm identifies a matrix C such that X approximately equals CX and C satisfies some linear constraints. The constraints are chosen to ensure that the matrix C selects features; these features can then be used to find a low-rank NMF of X. A theoretical analysis demonstrates that this approach has guarantees similar to those of the recent NMF algorithm of Arora et al. (2012). In contrast with this earlier work, the proposed method extends to more general noise models and leads to efficient, scalable algorithms. Experiments with synthetic and real datasets provide evidence that the new approach is also superior in practice. An optimized C++ implementation can factor a multigigabyte matrix in a matter of minutes.

Comment: 17 pages, 10 figures. Modified theorem statement for robust recovery conditions. Revised proof techniques to make arguments more elementary. Results on robustness when rows are duplicated have been superseded by arxiv.org/1211.668

arXiv.org e-Print Archive

Caltech Authors
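
To make the relation "X approximately equals CX" from the abstract above concrete, the following sketch builds a toy data matrix whose rows are mixtures of two anchor rows and writes down a matrix C that reproduces it exactly. It does not solve the paper's linear program and does not reproduce its constraints; it only shows what a feature-selecting C can look like when the anchors are already known. The row-wise orientation, the data values, and all names (`anchors`, `weights`, `selected`) are assumptions made for illustration.

```python
def matmul(X, Y):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]


# Hypothetical anchor rows: the "salient features" that express all other rows.
anchors = [[0.5, 0.5, 0.0, 0.0],
           [0.0, 0.0, 0.5, 0.5]]
# Hypothetical nonnegative mixing weights for the remaining rows (each sums to 1).
weights = [[0.5, 0.5],
           [0.25, 0.75],
           [0.75, 0.25]]

r = len(anchors)
X = anchors + matmul(weights, anchors)  # anchor rows first, then their mixtures
f = len(X)

# Build C: anchor rows reproduce themselves (1 on the diagonal), and every
# other row is written as a nonnegative combination of the anchor rows.
C = [[0.0] * f for _ in range(f)]
for i in range(r):
    C[i][i] = 1.0
for i, w in enumerate(weights):
    for j in range(r):
        C[r + i][j] = w[j]

CX = matmul(C, X)
residual = max(abs(a - b) for row_a, row_b in zip(CX, X) for a, b in zip(row_a, row_b))
selected = [j for j in range(f) if C[j][j] == 1.0]  # rows that C "selects"
print(residual, selected)
```

Only the columns of C indexed by the anchor rows are nonzero, which is the sense in which C selects features; with noisy data the equality would hold only approximately.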

A Fast Gradient Method for Nonnegative Sparse Regression with Self Dictionary

Authors: Gillis Nicolas, Luce Robert
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 16/08/2017

A nonnegative matrix factorization (NMF) can be computed efficiently under the separability assumption, which asserts that all the columns of the given input data matrix belong to the cone generated by a (small) subset of them. The provably most robust methods to identify these conic basis columns are based on nonnegative sparse regression and self dictionaries, and require the solution of large-scale convex optimization problems. In this paper we study a particular nonnegative sparse regression model with self dictionary. As opposed to previously proposed models, this model yields a smooth optimization problem where the sparsity is enforced through linear constraints. We show that the Euclidean projection on the polyhedron defined by these constraints can be computed efficiently, and propose a fast gradient method to solve our model. We compare our algorithm with several state-of-the-art methods on synthetic data sets and real-world hyperspectral images.

arXiv.org e-Print Archive

Infoscience - École polytechnique fédérale de Lausanne
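
The task of identifying the conic basis columns of a separable matrix, as described in the abstract above, can be illustrated on a toy instance. The sketch below uses a simple greedy method, the successive projection algorithm, which is a swapped-in technique and not the sparse regression model studied in the paper. It assumes noiseless data in which every non-basis column is a convex combination of linearly independent basis columns; the data and the function name `successive_projection` are hypothetical.

```python
import math


def successive_projection(columns, r):
    """Greedily pick r column indices: take the column with the largest
    residual norm, then project all columns onto its orthogonal complement."""
    residual = [list(c) for c in columns]
    chosen = []
    for _ in range(r):
        norms = [sum(x * x for x in c) for c in residual]
        j = max(range(len(residual)), key=norms.__getitem__)
        if norms[j] == 0.0:  # nothing left to select
            break
        chosen.append(j)
        scale = math.sqrt(norms[j])
        u = [x / scale for x in residual[j]]
        projected = []
        for c in residual:
            d = sum(ci * ui for ci, ui in zip(c, u))
            projected.append([ci - d * ui for ci, ui in zip(c, u)])
        residual = projected
    return chosen


# Hypothetical basis columns and two mixtures of them.
a1 = [2.0, 0.0, 1.0]
a2 = [0.0, 3.0, 1.0]
mix1 = [0.5 * x + 0.5 * y for x, y in zip(a1, a2)]
mix2 = [0.25 * x + 0.25 * y for x, y in zip(a1, a2)]

columns = [mix1, a1, mix2, a2]  # basis columns sit at indices 1 and 3
anchor_indices = successive_projection(columns, r=2)
print(sorted(anchor_indices))
```

On noisy or near-duplicate data a greedy selection like this can pick the wrong columns, which is the motivation for the more robust convex models discussed in this listing.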