
1,835 research outputs found

Coordinate Descent Methods for Symmetric Nonnegative Matrix Factorization

Author: Dhillon Inderjit
Gillis Nicolas
Lei Qi
Vandaele Arnaud
Zhong Kai
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 31/05/2016

Given a symmetric nonnegative matrix $A$, symmetric nonnegative matrix factorization (symNMF) is the problem of finding a nonnegative matrix $H$, usually with much fewer columns than $A$, such that $A \approx HH^T$. SymNMF can be used for data analysis and in particular for various clustering tasks. In this paper, we propose simple and very efficient coordinate descent schemes to solve this problem, and that can handle large and sparse input matrices. The effectiveness of our methods is illustrated on synthetic and real-world data sets, and we show that they perform favorably compared to recent state-of-the-art methods.

Comment: 25 pages, 5 figures, 7 tables. Main changes: comparison with another symNMF algorithm (namely, BetaSNMF), and correction of an error in the convergence proof

arXiv.org e-Print Archive
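
To make the symNMF objective in the abstract above concrete, here is a minimal sketch in plain Python. It is not the coordinate descent scheme proposed in the paper: it uses a simple projected gradient iteration as a stand-in, and the function names, the toy matrix `A`, the starting point `H`, and the step size are all assumptions chosen for illustration.

```python
def matmul(X, Y):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]


def transpose(X):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*X)]


def symnmf_objective(A, H):
    """Return 0.25 * ||A - H H^T||_F^2, the symNMF fitting error."""
    HHt = matmul(H, transpose(H))
    return 0.25 * sum((a - b) ** 2 for ra, rb in zip(A, HHt) for a, b in zip(ra, rb))


def projected_gradient_step(A, H, step):
    """One update H <- max(0, H - step * (H H^T - A) H).

    (H H^T - A) H is the gradient of the objective above for symmetric A;
    the max with 0 projects back onto the nonnegative orthant.
    """
    HHt = matmul(H, transpose(H))
    residual = [[b - a for a, b in zip(ra, rb)] for ra, rb in zip(A, HHt)]
    grad = matmul(residual, H)
    return [[max(0.0, h - step * g) for h, g in zip(rh, rg)] for rh, rg in zip(H, grad)]


# Toy symmetric nonnegative matrix with two obvious clusters (assumed example).
A = [[1.0, 1.0, 0.0, 0.0],
     [1.0, 1.0, 0.0, 0.0],
     [0.0, 0.0, 1.0, 1.0],
     [0.0, 0.0, 1.0, 1.0]]

# Arbitrary nonnegative starting point with r = 2 columns (assumed).
H = [[0.5, 0.7], [0.6, 0.5], [0.7, 0.6], [0.5, 0.7]]

initial_error = symnmf_objective(A, H)
for _ in range(200):
    H = projected_gradient_step(A, H, step=0.05)
final_error = symnmf_objective(A, H)
```

Comparing `final_error` with `initial_error` shows whether the iteration improved the fit. A fixed step size is used here only to keep the sketch short.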

Dropping Symmetry for Fast Symmetric Nonnegative Matrix Factorization

Author: Li Qiuwei
Li Xiao
Liu Kai
Zhu Zhihui
Publication date: 14/11/2018

Symmetric nonnegative matrix factorization (NMF), a special but important class of the general NMF, is demonstrated to be useful for data analysis and in particular for various clustering tasks. Unfortunately, designing fast algorithms for symmetric NMF is not as easy as for the nonsymmetric counterpart, the latter admitting the splitting property that allows efficient alternating-type algorithms. To overcome this issue, we transform symmetric NMF into a nonsymmetric one, which lets us adopt ideas from state-of-the-art algorithms for nonsymmetric NMF to design fast algorithms for symmetric NMF. We rigorously establish that solving the nonsymmetric reformulation returns a solution for symmetric NMF and then apply fast alternating-based algorithms to the reformulated problem. Furthermore, we show that these fast algorithms admit a strong convergence guarantee, in the sense that the generated sequence converges at least at a sublinear rate and converges globally to a critical point of the symmetric NMF. We conduct experiments on both synthetic data and image clustering to support our result.

Comment: Accepted in NIPS 201

arXiv.org e-Print Archive

Algorithms for Positive Semidefinite Factorization

Author: Gillis Nicolas
Glineur François
Vandaele Arnaud
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 25/07/2017

This paper considers the problem of positive semidefinite factorization (PSD factorization), a generalization of exact nonnegative matrix factorization. Given an $m$-by-$n$ nonnegative matrix $X$ and an integer $k$, the PSD factorization problem consists in finding, if possible, symmetric $k$-by-$k$ positive semidefinite matrices $\{A^1,...,A^m\}$ and $\{B^1,...,B^n\}$ such that $X_{i,j}=\text{trace}(A^iB^j)$ for $i=1,...,m$, and $j=1,...,n$. PSD factorization is NP-hard. In this work, we introduce several local optimization schemes to tackle this problem: a fast projected gradient method and two algorithms based on the coordinate descent framework. The main application of PSD factorization is the computation of semidefinite extensions, that is, the representations of polyhedrons as projections of spectrahedra, for which the matrix to be factorized is the slack matrix of the polyhedron. We compare the performance of our algorithms on this class of problems. In particular, we compute the PSD extensions of size $k=1+ \lceil \log_2(n) \rceil$ for the regular $n$-gons when $n=5$, $8$ and $10$. We also show how to generalize our algorithms to compute the square root rank (which is the size of the factors in a PSD factorization where all factor matrices $A^i$ and $B^j$ have rank one) and completely PSD factorizations (which is the special case where the input matrix is symmetric and equality $A^i=B^i$ is required for all $i$).

Comment: 21 pages, 3 figures, 3 tables

arXiv.org e-Print Archive
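
The defining relation $X_{i,j}=\text{trace}(A^iB^j)$ in the abstract above can be illustrated in the easy direction, building $X$ from given PSD factors. The sketch below does not solve the (NP-hard) factorization problem and is not one of the paper's algorithms. The helper names and the example vectors are assumptions. It uses rank-one factors $A^i = a_i a_i^T$ and $B^j = b_j b_j^T$, the case the abstract associates with the square root rank, for which $\text{trace}(A^iB^j) = (a_i^T b_j)^2$.

```python
def outer(v):
    """Return the rank-one PSD matrix v v^T as a list of rows."""
    return [[x * y for y in v] for x in v]


def trace_product(A, B):
    """Return trace(A B) for two square matrices of the same size."""
    k = len(A)
    return sum(A[p][q] * B[q][p] for p in range(k) for q in range(k))


def matrix_from_psd_factors(As, Bs):
    """Build X with X[i][j] = trace(As[i] Bs[j])."""
    return [[trace_product(A, B) for B in Bs] for A in As]


# Hypothetical generating vectors: m = 3 left factors, n = 2 right factors, k = 2.
a_vectors = [[1.0, 0.0], [1.0, 1.0], [0.0, 2.0]]
b_vectors = [[1.0, -1.0], [2.0, 1.0]]

As = [outer(a) for a in a_vectors]
Bs = [outer(b) for b in b_vectors]
X = matrix_from_psd_factors(As, Bs)

# For rank-one factors, each entry is the squared inner product of the vectors.
squared_inner = [[sum(x * y for x, y in zip(a, b)) ** 2 for b in b_vectors]
                 for a in a_vectors]
```

Every entry of `X` is nonnegative even though the generating vectors contain negative entries, which is why PSD factorization generalizes nonnegative factorization.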

Nonnegative Matrix Factorization for Signal and Data Analytics: Identifiability, Algorithms, and Applications

Author: Fu Xiao
Huang Kejun
Ma Wing-Kin
Sidiropoulos Nicholas D.
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 16/11/2018

Nonnegative matrix factorization (NMF) has become a workhorse for signal and data analytics, triggered by its model parsimony and interpretability. Perhaps a bit surprisingly, the understanding of its model identifiability---the major reason behind the interpretability in many applications such as topic mining and hyperspectral imaging---had been rather limited until recent years. Beginning from the 2010s, the identifiability research of NMF has progressed considerably: many interesting and important results have been discovered by the signal processing (SP) and machine learning (ML) communities. NMF identifiability has a great impact on many aspects in practice, such as ill-posed formulation avoidance and performance-guaranteed algorithm design. On the other hand, there is no tutorial paper that introduces NMF from an identifiability viewpoint. In this paper, we aim at filling this gap by offering a comprehensive and deep tutorial on model identifiability of NMF as well as the connections to algorithms and applications. This tutorial will help researchers and graduate students grasp the essence and insights of NMF, thereby avoiding typical "pitfalls" that are often due to unidentifiable NMF formulations. This paper will also help practitioners pick/design suitable factorization tools for their own problems.

Comment: accepted version, IEEE Signal Processing Magazine; supplementary materials added. Some minor revisions implemented

arXiv.org e-Print Archive

A Symmetric Rank-one Quasi Newton Method for Non-negative Matrix Factorization

Author: Lai Shu-Zhen
Li Hou-Biao
Zhang Zu-Tao
Publication date: 24/05/2013

As is well known, nonnegative matrix factorization (NMF) is a dimension reduction method that has been widely used in image processing, text compression, signal processing, etc. In this paper, an algorithm for nonnegative matrix approximation is proposed. The method is mainly based on the active set and a quasi-Newton type algorithm, using the symmetric rank-one and negative curvature direction techniques to approximate the Hessian matrix. Our method improves the recent results of the methods in [Pattern Recognition, 45(2012)3557-3565; SIAM J. Sci. Comput., 33(6)(2011)3261-3281; Neural Computation, 19(10)(2007)2756-2779, etc.]. Moreover, the objective function decreases faster than with many other NMF methods. In addition, numerical experiments are presented on synthetic data, image processing and text clustering. In a comparison with six other nonnegative matrix approximation methods, our experiments confirm our analysis.

Comment: 19 pages, 13 figures, Submitted to PP on Feb. 5, 201

arXiv.org e-Print Archive

Directory of Open Access Journals

A Nonconvex Splitting Method for Symmetric Nonnegative Matrix Factorization: Convergence Analysis and Optimality

Author: Hong Mingyi
Lu Songtao
Wang Zhengdao
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 23/03/2017

Symmetric nonnegative matrix factorization (SymNMF) has important applications in data analytics problems such as document clustering, community detection and image segmentation. In this paper, we propose a novel nonconvex variable splitting method for solving SymNMF. The proposed algorithm is guaranteed to converge to the set of Karush-Kuhn-Tucker (KKT) points of the nonconvex SymNMF problem. Furthermore, it achieves a global sublinear convergence rate. We also show that the algorithm can be efficiently implemented in parallel. Further, sufficient conditions are provided which guarantee the global and local optimality of the obtained solutions. Extensive numerical results performed on both synthetic and real data sets suggest that the proposed algorithm converges quickly to a local minimum solution.

Comment: IEEE Transactions on Signal Processing (to appear)

arXiv.org e-Print Archive

The Why and How of Nonnegative Matrix Factorization

Author: Gillis Nicolas
Publication date: 07/03/2014

Nonnegative matrix factorization (NMF) has become a widely used tool for the analysis of high-dimensional data as it automatically extracts sparse and meaningful features from a set of nonnegative data vectors. We first illustrate this property of NMF on three applications, in image processing, text mining and hyperspectral imaging --this is the why. Then we address the problem of solving NMF, which is NP-hard in general. We review some standard NMF algorithms, and also present a recent subclass of NMF problems, referred to as near-separable NMF, that can be solved efficiently (that is, in polynomial time), even in the presence of noise --this is the how. Finally, we briefly describe some problems in mathematics and computer science closely related to NMF via the nonnegative rank.

Comment: 25 pages, 5 figures. Some typos and errors corrected, Section 3.2 reorganized

arXiv.org e-Print Archive

Microbial community pattern detection in human body habitats via ensemble clustering framework

Author: Chua Hon-Nian
Li Xiao-Li
Ning Kang
Ou-Yang Le
Su Xiaoquan
Yang Peng
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2014

The human habitat is a host where microbial species evolve, function, and continue to evolve. Elucidating how microbial communities respond to human habitats is a fundamental and critical task, as establishing baselines of the human microbiome is essential to understanding its role in human disease and health. However, current studies usually overlook the complex and interconnected landscape of the human microbiome and are limited to particular body habitats, with learning models built on a specific criterion. Therefore, these methods cannot capture the real-world underlying microbial patterns effectively. To obtain a comprehensive view, we propose a novel ensemble clustering framework to mine the structure of microbial community patterns in large-scale metagenomic data. In particular, we first build a microbial similarity network by integrating 1920 metagenomic samples from three body habitats of healthy adults. Then a novel symmetric Nonnegative Matrix Factorization (NMF) based ensemble model is proposed and applied to the network to detect clustering patterns. Extensive experiments are conducted to evaluate the effectiveness of our model in deriving microbial communities with respect to body habitat and host gender. From the clustering results, we observed that body habitats exhibit strongly bound but non-unique microbial structural patterns. Meanwhile, the human microbiome reveals different degrees of structural variation over body habitat and host gender. In summary, our ensemble clustering framework can efficiently explore integrated clustering results to accurately identify microbial communities, and provides a comprehensive view of a set of microbial communities. Such trends depict an integrated biography of microbial communities, which offers new insight towards uncovering pathogenic models of the human microbiome.

Comment: BMC Systems Biology 201

arXiv.org e-Print Archive

Accelerated Parallel and Distributed Algorithm using Limited Internal Memory for Nonnegative Matrix Factorization

Author: Ho Tu-Bao
Nguyen Duy-Khuong
Publication date: 30/06/2015

Nonnegative matrix factorization (NMF) is a powerful technique for dimension reduction, extracting latent factors and learning part-based representation. For large datasets, NMF performance depends on some major issues: fast algorithms, fully parallel distributed feasibility and limited internal memory. This research aims to design a fast fully parallel and distributed algorithm using limited internal memory to reach high NMF performance for large datasets. In particular, we propose a flexible accelerated algorithm for NMF with all its $L_1$ $L_2$ regularized variants based on full decomposition, which is a combination of an anti-lopsided algorithm and a fast block coordinate descent algorithm. The proposed algorithm takes advantage of both these algorithms to achieve a linear convergence rate of $\mathcal{O}(1-\frac{1}{||Q||_2})^k$ in optimizing each factor matrix when the other factor is fixed, in the sub-space of passive variables, where $r$ is the number of latent components and $\sqrt{r} \leq ||Q||_2 \leq r$. In addition, the algorithm can exploit data sparseness to run on large datasets with limited internal memory of machines. Furthermore, our experimental results are highly competitive with 7 state-of-the-art methods on three significant aspects: convergence, optimality and average iteration number. Therefore, the proposed algorithm is superior to fast block coordinate descent methods and accelerated methods.

arXiv.org e-Print Archive

Introduction to Nonnegative Matrix Factorization

Author: Gillis Nicolas
Publication date: 02/03/2017

In this paper, we introduce and provide a short overview of nonnegative matrix factorization (NMF). Several aspects of NMF are discussed, namely, the application in hyperspectral imaging, geometry and uniqueness of NMF solutions, complexity, algorithms, and its link with extended formulations of polyhedra. In order to put NMF into perspective, the more general problem class of constrained low-rank matrix approximation problems is first briefly introduced.

Comment: 18 pages, 4 figures

arXiv.org e-Print Archive