Generalized Separable Nonnegative Matrix Factorization
Nonnegative matrix factorization (NMF) is a linear dimensionality reduction
technique for nonnegative data with applications such as image analysis, text
mining, audio source separation and hyperspectral unmixing. Given a data matrix
$M$ and a factorization rank $r$, NMF looks for a nonnegative matrix $W$ with
$r$ columns and a nonnegative matrix $H$ with $r$ rows such that $M \approx WH$.
NMF is NP-hard to solve in general. However, it can be computed efficiently
under the separability assumption, which requires that the basis vectors appear
as data points, that is, that there exists an index set $\mathcal{K}$ such that
$W = M(:,\mathcal{K})$. In this paper, we generalize the separability
assumption: We only require that for each rank-one factor $W(:,k)H(k,:)$ for
$k = 1, 2, \dots, r$, either $W(:,k) = M(:,j)$ for some $j$ or $H(k,:) = M(i,:)$
for some $i$. We refer to the corresponding problem as generalized separable NMF
(GS-NMF). We discuss some properties of GS-NMF and propose a convex
optimization model which we solve using a fast gradient method. We also propose
a heuristic algorithm inspired by the successive projection algorithm. To
verify the effectiveness of our methods, we compare them with several
state-of-the-art separable NMF algorithms on synthetic, document and image data
sets.
Comment: 31 pages, 12 figures, 4 tables. We have added discussions about the
identifiability of the model, we have modified the first synthetic
experiment, we have clarified some aspects of the contributio
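The separability assumption in the abstract above can be illustrated with a minimal sketch of the standard successive projection algorithm (SPA) for plain separable NMF. This is not the GS-NMF heuristic or the convex model proposed in the paper; the function name `spa`, the synthetic matrices and the random seed are assumptions made only for this illustration.

```python
import numpy as np

def spa(M, r):
    """Select r column indices of M by successive projection (plain separable NMF)."""
    R = np.array(M, dtype=float)  # residual matrix (a copy of M)
    K = []
    for _ in range(r):
        # Pick the column of the residual with the largest squared l2 norm.
        j = int(np.argmax(np.sum(R * R, axis=0)))
        K.append(j)
        # Project every column onto the orthogonal complement of that column.
        u = R[:, j] / np.linalg.norm(R[:, j])
        R -= np.outer(u, u @ R)
    return K

# Synthetic separable data (illustrative only): the first 3 columns of M
# are the basis vectors, the remaining 5 are convex combinations of them.
rng = np.random.default_rng(0)
W = rng.random((6, 3))
H_mix = rng.dirichlet(np.ones(3), size=5).T  # 3 x 5, each column sums to 1
H = np.hstack([np.eye(3), H_mix])
M = W @ H

K = spa(M, 3)  # indices of the columns chosen as basis vectors
```

On noiseless data of this kind, the selected columns `M[:, K]` coincide with the columns of `W`, possibly in a different order, which is the sense of $W = M(:,\mathcal{K})$.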
Document Clustering Based On Max-Correntropy Non-Negative Matrix Factorization
Nonnegative matrix factorization (NMF) has been successfully applied to many
areas for classification and clustering. Commonly used NMF algorithms mainly
aim to minimize the $\ell_2$ distance or the Kullback-Leibler (KL) divergence,
which may not be suitable for the nonlinear case. In this paper, we propose a new
decomposition method for document clustering that maximizes the correntropy
between the original matrix and the product of two low-rank matrices. This
method also allows us to learn the new basis vectors of the semantic feature
space from the data. To our knowledge, no prior work has maximized correntropy
in NMF to cluster high-dimensional document data. Our experimental results show
the superiority of our proposed method over other variants of the NMF
algorithm on the Reuters21578 and TDT2 data sets.
Comment: International Conference of Machine Learning and Cybernetics (ICMLC)
201
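As a rough illustration of the quantity being maximized, the sketch below evaluates an empirical correntropy between a matrix and a low-rank product. It assumes an entrywise Gaussian kernel with a hypothetical bandwidth `sigma`; the paper's exact objective and optimization scheme may differ, and the function name `correntropy` and the toy data are illustrative only.

```python
import numpy as np

def correntropy(X, Y, sigma=1.0):
    """Empirical correntropy of X and Y: mean Gaussian kernel of the entrywise error."""
    E = np.asarray(X, dtype=float) - np.asarray(Y, dtype=float)
    return float(np.mean(np.exp(-(E ** 2) / (2.0 * sigma ** 2))))

# Toy data (illustrative only): an exact rank-2 product, then one corrupted entry.
rng = np.random.default_rng(0)
W = rng.random((5, 2))
H = rng.random((2, 4))
X = W @ H
X_outlier = X.copy()
X_outlier[0, 0] += 100.0  # a single gross outlier

exact = correntropy(X, W @ H)              # perfect fit
robust = correntropy(X_outlier, W @ H)     # fit with one outlier
sq_err = float(np.sum((X_outlier - W @ H) ** 2))  # squared l2 error, for contrast
```

Each entry contributes at most 1 to the kernel sum, so a single outlier lowers the correntropy by at most one part in the number of entries, whereas the squared error grows with the square of the outlier's magnitude.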