Feature Weighted Non-negative Matrix Factorization
Non-negative Matrix Factorization (NMF) is one of the most popular techniques
for data representation and clustering, and has been widely used in machine
learning and data analysis. NMF collects the features of each sample into a
vector and approximates it by a linear combination of basis vectors, yielding a
low-dimensional representation. However, in real-world applications, features
usually differ in importance. To exploit the discriminative features, some
methods project the samples into a subspace with a transformation matrix, which
disturbs the original feature attributes and neglects the diversity of the
samples. To alleviate these problems, we
propose the Feature weighted Non-negative Matrix Factorization (FNMF) in this
paper. The salient properties of FNMF are threefold: 1) it learns the feature
weights adaptively according to their importance; 2) it
utilizes multiple feature weighting components to preserve the diversity; 3) it
can be solved efficiently with the suggested optimization algorithm.
Results on synthetic and real-world datasets demonstrate that the proposed
method achieves state-of-the-art performance.
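
The weighted-factorization idea can be made concrete with a short sketch. The NumPy snippet below is a minimal illustration, not the authors' algorithm: it runs multiplicative updates for a weighted objective ||diag(a)(X - WH)||_F^2 with a fixed feature-weight vector a, whereas FNMF additionally learns the weights adaptively and uses multiple weighting components. The function weighted_nmf and all parameters are hypothetical names introduced for illustration.

import numpy as np

def weighted_nmf(X, a, rank, n_iter=200, eps=1e-9):
    # Multiplicative updates for min_{W,H >= 0} ||diag(a) (X - W H)||_F^2.
    # X: (d, n) non-negative data, one sample per column.
    # a: (d,) non-negative feature weights, kept FIXED here; FNMF learns
    #    them adaptively, which this sketch does not reproduce.
    d, n = X.shape
    rng = np.random.default_rng(0)
    W = rng.random((d, rank))
    H = rng.random((rank, n))
    A2 = (a ** 2)[:, None]            # squared weights, broadcast over samples
    for _ in range(n_iter):
        H *= (W.T @ (A2 * X)) / (W.T @ (A2 * (W @ H)) + eps)
        W *= ((A2 * X) @ H.T) / ((A2 * (W @ H)) @ H.T + eps)
    return W, H

# Toy usage: 50 features, 100 samples, 5 basis vectors; the first 10
# features are treated as three times more important than the rest.
rng = np.random.default_rng(1)
X = rng.random((50, 100))
a = np.ones(50)
a[:10] = 3.0
W, H = weighted_nmf(X, a, rank=5)

Because a scales the residual row-wise, heavily weighted features must be reconstructed more accurately, which is the behaviour the feature weighting in FNMF targets.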
Entropy Minimizing Matrix Factorization
Nonnegative Matrix Factorization (NMF) is a widely-used data analysis
technique, and has yielded impressive results in many real-world tasks.
Generally, existing NMF methods represent each sample with several centroids,
and find the optimal centroids by minimizing the sum of the approximation
errors. However, outliers deviating from the normal data distribution may have
large residuals, which can seriously dominate the objective value. In this
study, an Entropy Minimizing Matrix Factorization framework (EMMF) is developed
to tackle this problem. Considering that outliers are usually far fewer than
normal samples, a new entropy loss function is established for
matrix factorization, which minimizes the entropy of the residual distribution
and allows a few samples to have large approximation errors. In this way, the
outliers do not affect the approximation of the normal samples. The
multiplicative updating rules for EMMF are also designed, and their convergence
is proved theoretically and verified experimentally. In addition, a
Graph-regularized version of EMMF (G-EMMF) is also presented to deal with
complex data structures. Clustering results on various synthetic and real-world
datasets
demonstrate the reasonableness of the proposed models, and their effectiveness
is further verified through comparison with state-of-the-art methods.
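
The outlier-handling idea can be illustrated with a rough sketch. The NumPy snippet below is an assumption-laden stand-in, not EMMF itself: between standard multiplicative updates it down-weights samples with large reconstruction residuals using an exponential weighting, so a few outlying columns stop dominating the shared basis. The paper's actual entropy loss, its update rules, and the graph-regularized G-EMMF variant are not reproduced; the name reweighted_nmf and all details are hypothetical.

import numpy as np

def reweighted_nmf(X, rank, n_iter=200, eps=1e-9):
    # Sample-reweighted NMF sketch: columns with large residuals receive
    # small weights, so outliers barely influence the shared basis W.
    # The exponential weighting below is illustrative only, not the
    # entropy loss proposed in the paper.
    d, n = X.shape
    rng = np.random.default_rng(0)
    W = rng.random((d, rank))
    H = rng.random((rank, n))
    s = np.ones(n)                                # per-sample weights
    for _ in range(n_iter):
        # standard update for H (each column depends only on its own sample)
        H *= (W.T @ X) / (W.T @ (W @ H) + eps)
        # weighted update for W: outlying columns contribute little
        W *= ((X * s) @ H.T) / (((W @ H) * s) @ H.T + eps)
        # refresh weights from per-sample squared residuals
        r = ((X - W @ H) ** 2).sum(axis=0)
        s = np.exp(-r / (np.median(r) + eps))
        s *= n / (s.sum() + eps)                  # keep the average weight at 1
    return W, H

# Toy usage: clean data plus 5 outlier columns with much larger magnitude.
rng = np.random.default_rng(1)
X = rng.random((50, 100))
X[:, :5] *= 20.0                                  # inject outliers
W, H = reweighted_nmf(X, rank=5)

The key property this sketch shares with EMMF is that a small number of samples are allowed to keep large approximation errors instead of pulling the factors toward them.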