Document Clustering Based On Max-Correntropy Non-Negative Matrix Factorization
Nonnegative matrix factorization (NMF) has been successfully applied to many
areas for classification and clustering. Commonly used NMF algorithms mainly
aim to minimize the distance or the Kullback-Leibler (KL) divergence,
which may not be suitable for nonlinear cases. In this paper, we propose a new
decomposition method that maximizes the correntropy between the original matrix
and the product of two low-rank matrices for document clustering. This method also
allows us to learn the new basis vectors of the semantic feature space from the
data. To our knowledge, no prior work has maximized correntropy in NMF to
cluster high-dimensional document data. Our experimental results show the
superiority of our proposed method over other variants of the NMF algorithm
on the Reuters21578 and TDT2 datasets.
Comment: International Conference of Machine Learning and Cybernetics (ICMLC)
201
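As a rough illustration of the objective this abstract describes, the sketch below computes an empirical correntropy between a matrix and a low-rank product. It assumes a Gaussian kernel with a single bandwidth `sigma` and omits the kernel's normalizing constant; the abstract specifies neither, and the names `matmul`, `correntropy`, `X`, `W`, and `H` are hypothetical, chosen only for this example.

```python
import math


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def correntropy(x, y, sigma=1.0):
    """Mean Gaussian-kernel similarity between matching entries of x and y.

    Assumption: unnormalized kernel exp(-(a - b)^2 / (2 * sigma^2)).
    """
    total = 0.0
    count = 0
    for row_x, row_y in zip(x, y):
        for a, b in zip(row_x, row_y):
            total += math.exp(-((a - b) ** 2) / (2.0 * sigma ** 2))
            count += 1
    return total / count


# Toy data: an exact factorization X = W H with W the identity.
X = [[1.0, 2.0], [3.0, 4.0]]
W = [[1.0, 0.0], [0.0, 1.0]]
H = [[1.0, 2.0], [3.0, 4.0]]
exact_score = correntropy(X, matmul(W, H))

# A reconstruction with one grossly wrong entry: its kernel value is
# close to zero, so it lowers the score by at most 1/4 of the total.
outlier_score = correntropy(X, [[1.0, 2.0], [3.0, 104.0]])
```

Because each entry contributes a value between 0 and 1, a single large residual has bounded influence on the score, whereas a squared-error objective would grow with the square of that residual.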
Graph Regularized Non-negative Matrix Factorization By Maximizing Correntropy
Non-negative matrix factorization (NMF) has proved effective in many
clustering and classification tasks. The classic measures of the error
between the original and the reconstructed matrix are the distance and the
Kullback-Leibler (KL) divergence. However, these error measures do not
properly handle nonlinear cases. As a consequence, alternative
measures based on nonlinear kernels, such as correntropy, have been proposed.
However, current correntropy-based NMF targets only the low-level
features without considering the intrinsic geometrical distribution of data. In
this paper, we propose a new NMF algorithm that preserves local invariance by
adding graph regularization into the process of max-correntropy-based matrix
factorization. Meanwhile, each feature can learn a corresponding kernel from the
data. Experimental results on Caltech101 and Caltech256 show the benefits of
this combination over other NMF algorithms for unsupervised image
clustering
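The graph regularization mentioned in this abstract is commonly written as a smoothness penalty over a neighbor graph. The sketch below assumes that standard form, which the abstract itself does not spell out: a symmetric weight matrix `weights`, and a penalty equal to half the weighted sum of squared distances between low-dimensional representations, which matches the trace form with the graph Laplacian L = D - W. All names are hypothetical.

```python
def pairwise_penalty(reps, weights):
    """0.5 * sum_ij weights[i][j] * ||reps[i] - reps[j]||^2."""
    n = len(reps)
    total = 0.0
    for i in range(n):
        for j in range(n):
            dist_sq = sum((a - b) ** 2 for a, b in zip(reps[i], reps[j]))
            total += weights[i][j] * dist_sq
    return 0.5 * total


def laplacian_penalty(reps, weights):
    """Tr(V^T L V) with L = D - W, computed as sum_ij L[i][j] * <v_i, v_j>."""
    n = len(reps)
    total = 0.0
    for i in range(n):
        degree = sum(weights[i])
        for j in range(n):
            l_ij = (degree if i == j else 0.0) - weights[i][j]
            total += l_ij * sum(a * b for a, b in zip(reps[i], reps[j]))
    return total


# Three points on a path graph 0 - 1 - 2; points 0 and 1 share a representation.
reps = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
weights = [[0.0, 1.0, 0.0], [1.0, 0.0, 1.0], [0.0, 1.0, 0.0]]
penalty_a = pairwise_penalty(reps, weights)
penalty_b = laplacian_penalty(reps, weights)
```

Only the edge between points 1 and 2 contributes here, since neighbors with identical representations add nothing to the penalty.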
Four algorithms to solve symmetric multi-type non-negative matrix tri-factorization problem
In this paper, we consider the symmetric multi-type non-negative matrix
tri-factorization problem (SNMTF), which attempts to factorize several
symmetric non-negative matrices simultaneously. This can be considered as a
generalization of the classical non-negative matrix tri-factorization problem
and includes a non-convex objective function, which is a multivariate
sixth-degree polynomial, and has a convex feasibility set. It has a special importance
in data science, since it serves as a mathematical model for the fusion of
different data sources in data clustering.
We develop four methods to solve the SNMTF. They are based on four
theoretical approaches known from the literature: the fixed point method (FPM),
the block-coordinate descent with projected gradient (BCD), the gradient method
with exact line search (GM-ELS) and the adaptive moment estimation method
(ADAM). For each of these methods we offer a software implementation: for the
first two methods we use Matlab and for the latter two Python with the TensorFlow
library.
We test these methods on three data-sets: one is a synthetic data-set that we
generated, while the other two represent real-life similarities between different
objects.
Extensive numerical results show that with sufficient computing time all four
methods perform satisfactorily and ADAM most often yields the best mean square
error (MSE). However, if the computation time is limited, FPM gives
the best MSE because it shows the fastest convergence at the
beginning.
All data-sets and code are publicly available on our GitLab profile.
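To see why the objective in this abstract is a sixth-degree polynomial, consider a single-matrix sketch. It assumes the squared-error form ||R - G S G^T||_F^2 for one symmetric matrix R; the paper's full multi-type formulation is not given in the abstract and is not reproduced here. The helper names are hypothetical. Scaling both G and S by t scales G S G^T by t^3, so with R = 0 the objective scales by t^6.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def transpose(a):
    """Return the transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*a)]


def scale(a, t):
    """Multiply every entry of a by the scalar t."""
    return [[t * v for v in row] for row in a]


def tri_objective(r, g, s):
    """Squared Frobenius error ||R - G S G^T||^2 (assumed single-matrix form)."""
    approx = matmul(matmul(g, s), transpose(g))
    return sum(
        (r_ij - a_ij) ** 2
        for row_r, row_a in zip(r, approx)
        for r_ij, a_ij in zip(row_r, row_a)
    )


G = [[1.0, 0.0], [1.0, 2.0]]      # non-negative factor
S = [[1.0, 1.0], [1.0, 3.0]]      # symmetric non-negative core
ZERO = [[0.0, 0.0], [0.0, 0.0]]

base = tri_objective(ZERO, G, S)
scaled = tri_objective(ZERO, scale(G, 2.0), scale(S, 2.0))
ratio = scaled / base             # degree-6 homogeneity gives 2 ** 6
```

The same function evaluates to zero when R equals G S G^T exactly, which makes it usable as a quick sanity check for any of the four solvers named in the abstract.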
An Oracle Inequality for Quasi-Bayesian Non-Negative Matrix Factorization
The aim of this paper is to provide some theoretical understanding of
quasi-Bayesian aggregation methods for non-negative matrix factorization. We derive
an oracle inequality for an aggregated estimator. This result holds for a very
general class of prior distributions and shows how the prior affects the rate
of convergence.
Comment: This is the corrected version of the published paper P. Alquier, B.
Guedj, An Oracle Inequality for Quasi-Bayesian Non-negative Matrix
Factorization, Mathematical Methods of Statistics, 2017, vol. 26, no. 1, pp.
55-67. Since then Arnak Dalalyan (ENSAE) found a mistake in the proofs. We
fixed the mistake at the price of a slightly different logarithmic term in
the boun