11,392 research outputs found
Topic supervised non-negative matrix factorization
Topic models have been extensively used to organize and interpret the
contents of large, unstructured corpora of text documents. Although topic
models often perform well on traditional training vs. test set evaluations, it
is often the case that the results of a topic model do not align with human
interpretation. This interpretability fallacy is largely due to the
unsupervised nature of topic models, which prohibits any user guidance on the
results of a model. In this paper, we introduce a semi-supervised method called
topic supervised non-negative matrix factorization (TS-NMF) that enables the
user to provide labeled example documents to promote the discovery of more
meaningful semantic structure of a corpus. In this way, the results of TS-NMF
better match the intuition and desired labeling of the user. The core of TS-NMF
relies on solving a non-convex optimization problem for which we derive an
iterative algorithm that is shown to be monotonic and convergent to a local
optimum. We demonstrate the practical utility of TS-NMF on the Reuters and
PubMed corpora, and find that TS-NMF is especially useful for conceptual or
broad topics, where topic key terms are not well understood. Although
identifying an optimal latent structure for the data is not a primary objective
of the proposed approach, we find that TS-NMF achieves higher weighted Jaccard
similarity scores than the contemporary methods, (unsupervised) NMF and latent
Dirichlet allocation, at supervision rates as low as 10% to 20%
Non-negative matrix factorization with sparseness constraints
Non-negative matrix factorization (NMF) is a recently developed technique for
finding parts-based, linear representations of non-negative data. Although it
has successfully been applied in several applications, it does not always
result in parts-based representations. In this paper, we show how explicitly
incorporating the notion of `sparseness' improves the found decompositions.
Additionally, we provide complete MATLAB code both for standard NMF and for our
extension. Our hope is that this will further the application of these methods
to solving novel data-analysis problems
Non-negative matrix factorization for medical imaging
A non-negative matrix factorization approach to dimensionality reduction is proposed to aid classification of images. The original images can be stored as lower-dimensional columns of a matrix that hold degrees of belonging to feature components, so they can be used in the training phase of the classification at lower runtime and without loss in accuracy. The extracted features can be visually examined and images reconstructed with limited error. The proof of concept is performed on a benchmark of handwritten digits, followed by the application to histopathological colorectal cancer slides. Results are encouraging, though dealing with real-world medical data raises a number of issues.Universidad de Málaga. Campus de Excelencia Internacional AndalucĂa Tec
Intersecting Faces: Non-negative Matrix Factorization With New Guarantees
Non-negative matrix factorization (NMF) is a natural model of admixture and
is widely used in science and engineering. A plethora of algorithms have been
developed to tackle NMF, but due to the non-convex nature of the problem, there
is little guarantee on how well these methods work. Recently a surge of
research have focused on a very restricted class of NMFs, called separable NMF,
where provably correct algorithms have been developed. In this paper, we
propose the notion of subset-separable NMF, which substantially generalizes the
property of separability. We show that subset-separability is a natural
necessary condition for the factorization to be unique or to have minimum
volume. We developed the Face-Intersect algorithm which provably and
efficiently solves subset-separable NMF under natural conditions, and we prove
that our algorithm is robust to small noise. We explored the performance of
Face-Intersect on simulations and discuss settings where it empirically
outperformed the state-of-art methods. Our work is a step towards finding
provably correct algorithms that solve large classes of NMF problems
- …