A review on initialization methods for nonnegative matrix factorization: Towards omics data experiments
Nonnegative Matrix Factorization (NMF) has acquired an important role in knowledge extraction, because non-negativity applies to both bases and weights, which allows meaningful interpretations and is consistent with the natural human part-based learning process. Nevertheless, most NMF algorithms are iterative, so initialization methods affect convergence behaviour, the quality of the final solution, and NMF performance in terms of the residual of the cost function. Studies on the impact of NMF initialization techniques have been conducted for text or image datasets, but very few considerations can be found in the literature when biological datasets are studied, even though NMFs have largely demonstrated their usefulness in better understanding biological mechanisms with omic datasets. This paper aims to present the state of the art in NMF initialization schemes along with some initial considerations on the impact of initialization methods when microarrays (a simple instance of omic data) are evaluated with NMF mechanisms. Using a series of measures to qualitatively examine the biological information extracted by a given NMF scheme, it preliminarily appears that some information (e.g., represented by genes) can be extracted regardless of the initialization scheme used.
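The dependence on initialization described in this abstract can be illustrated with a minimal sketch. It assumes the classical multiplicative-update rule for the squared Frobenius cost and a plain seeded random start; it is not one of the initialization schemes reviewed in the paper. The toy matrix `X` and the helper names (`matmul`, `residual`, `nmf_multiplicative`) are hypothetical and chosen only for illustration.

```python
import random

def matmul(A, B):
    """Plain-Python product of two matrices stored as lists of rows."""
    Bt = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in Bt] for row in A]

def transpose(A):
    """Transpose a matrix stored as a list of rows."""
    return [list(col) for col in zip(*A)]

def residual(X, W, H):
    """Squared Frobenius norm of X - WH (the cost-function residual)."""
    WH = matmul(W, H)
    return sum((x - y) ** 2 for rx, ry in zip(X, WH) for x, y in zip(rx, ry))

def nmf_multiplicative(X, rank, seed, iters=200, eps=1e-9):
    """Multiplicative updates for min ||X - WH||_F^2, started from a seeded random W, H."""
    rng = random.Random(seed)
    n, m = len(X), len(X[0])
    # Random nonnegative initialization: the only thing that changes between runs.
    W = [[rng.random() + eps for _ in range(rank)] for _ in range(n)]
    H = [[rng.random() + eps for _ in range(m)] for _ in range(rank)]
    history = [residual(X, W, H)]
    for _ in range(iters):
        # H <- H * (W^T X) / (W^T W H), element-wise
        Wt = transpose(W)
        num, den = matmul(Wt, X), matmul(matmul(Wt, W), H)
        H = [[h * a / (b + eps) for h, a, b in zip(hr, nr, dr)]
             for hr, nr, dr in zip(H, num, den)]
        # W <- W * (X H^T) / (W H H^T), element-wise
        Ht = transpose(H)
        num, den = matmul(X, Ht), matmul(W, matmul(H, Ht))
        W = [[w * a / (b + eps) for w, a, b in zip(wr, nr, dr)]
             for wr, nr, dr in zip(W, num, den)]
        history.append(residual(X, W, H))
    return W, H, history

# Hypothetical toy data (not from the paper): a small nonnegative matrix.
X = [
    [5.0, 3.0, 0.0, 1.0, 2.0],
    [4.0, 0.0, 0.0, 1.0, 1.0],
    [1.0, 1.0, 0.0, 5.0, 4.0],
    [0.0, 1.0, 5.0, 4.0, 3.0],
]

# Same data, same rank, same iteration budget; only the starting point differs.
for seed in (0, 1, 2):
    _, _, history = nmf_multiplicative(X, rank=2, seed=seed)
    print(f"seed={seed}  initial residual={history[0]:.4f}  final residual={history[-1]:.4f}")
```

Comparing the printed residuals across seeds gives a rough sense of how much the starting point matters for a fixed iteration budget. Structured initializations would replace only the two lines that build the starting `W` and `H`.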
Using Underapproximations for Sparse Nonnegative Matrix Factorization
Nonnegative Matrix Factorization consists in (approximately) factorizing a
nonnegative data matrix as the product of two low-rank nonnegative matrices. It
has been successfully applied as a data analysis technique in numerous domains,
e.g., text mining, image processing, microarray data analysis, collaborative
filtering, etc.
We introduce a novel approach to solve NMF problems, based on the use of an
underapproximation technique, and show its effectiveness in obtaining sparse
solutions. This approach, based on Lagrangian relaxation, allows the resolution
of NMF problems in a recursive fashion. We also prove that the
underapproximation problem is NP-hard for any fixed factorization rank, using a
reduction from the maximum edge biclique problem in bipartite graphs.
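
For orientation, one common way to write the two problems is sketched below. The symbols $M$ (data matrix), $U$, $V$ (factors), $r$ (factorization rank) and the choice of the Frobenius norm are assumptions made here for illustration; the abstract itself fixes no notation.

```latex
% NMF: approximate a nonnegative M by a product of nonnegative low-rank factors
\min_{U \in \mathbb{R}^{m \times r},\; V \in \mathbb{R}^{r \times n}}
  \; \| M - UV \|_F^2
  \quad \text{s.t.} \quad U \ge 0,\; V \ge 0 .

% Underapproximation: the same objective with the extra constraint UV <= M
\min_{U \ge 0,\; V \ge 0}
  \; \| M - UV \|_F^2
  \quad \text{s.t.} \quad UV \le M .

% Recursive use with rank-one factors: the residual stays nonnegative,
% so the same problem can be solved again on it
R_1 = M, \qquad
R_{k+1} = R_k - u_k v_k^{\top} \ge 0, \qquad k = 1, \dots, r .
```

Under this reading, the upper bound $UV \le M$ is what keeps each residual $R_{k+1}$ nonnegative, which is why the factorization can be built one rank-one term at a time.
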
We test two variants of our underapproximation approach on several standard
image datasets and show that they provide sparse part-based representations
with low reconstruction error. Our results are comparable and sometimes
superior to those obtained by two standard Sparse Nonnegative Matrix
Factorization techniques.
Comment: Version 2 removed the section about convex reformulations, which was
not central to the development of our main results; added material to the
introduction; added a review of previous related work (section 2.3);
completely rewrote the last part (section 4) to provide extensive numerical
results supporting our claims. Accepted in J. of Pattern Recognitio