Interpretable Low-Rank Document Representations with Label-Dependent Sparsity Patterns
In the context of document classification, where the labels of the documents
in a corpus are readily known, label information can be used to learn
document representation spaces with better discriminative properties. To this
end, this paper proposes applying Variational Bayesian Supervised Nonnegative
Matrix Factorization (supervised vbNMF) with a label-driven sparsity structure
on the coefficients to learn discriminative, nonsubtractive latent semantic
components occurring in TF-IDF document representations. The constraints force
the learned components to occur frequently in only a small set of labels,
which yields document representations with distinctive label-specific sparse
activation patterns. A simple measure of the quality of this kind of sparsity
structure, dubbed inter-label sparsity, is introduced and shown experimentally
to be closely connected to classification performance.
As a great practical convenience, inter-label sparsity is shown to be
easily controlled in supervised vbNMF by a single parameter
- …
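
The idea of label-specific sparse activation patterns can be illustrated with
a small sketch. The abstract does not give the definition of inter-label
sparsity, so the code below is not the paper's measure. It uses a hypothetical
proxy, label_concentration, which averages each component's activation per
label and scores how concentrated that per-label profile is using the Hoyer
sparsity measure. The function names and the toy coefficient matrices are
assumptions made for illustration only.

```python
import math


def hoyer_sparsity(values):
    """Hoyer sparsity: 0 for a uniform vector, 1 for a single nonzero entry."""
    n = len(values)
    l1 = sum(abs(v) for v in values)
    l2 = math.sqrt(sum(v * v for v in values))
    if n < 2 or l2 == 0.0:
        return 0.0
    return (math.sqrt(n) - l1 / l2) / (math.sqrt(n) - 1.0)


def label_concentration(activations, labels):
    """Hypothetical proxy, NOT the paper's inter-label sparsity.

    activations: one nonnegative coefficient vector per document.
    labels: one label per document.
    Returns the mean, over components, of the Hoyer sparsity of each
    component's per-label mean activation.
    """
    label_set = sorted(set(labels))
    n_components = len(activations[0])
    scores = []
    for k in range(n_components):
        per_label = []
        for lab in label_set:
            vals = [a[k] for a, l in zip(activations, labels) if l == lab]
            per_label.append(sum(vals) / len(vals))
        scores.append(hoyer_sparsity(per_label))
    return sum(scores) / n_components


# Toy coefficients (made up): two components, four documents, two labels.
labels = ["a", "a", "b", "b"]
label_specific = [[1.0, 0.0], [0.8, 0.0], [0.0, 0.9], [0.0, 1.1]]
shared = [[1.0, 1.0], [1.0, 1.0], [1.0, 1.0], [1.0, 1.0]]

specific_score = label_concentration(label_specific, labels)
shared_score = label_concentration(shared, labels)
```

In this toy setting, components that fire for one label only score close to 1,
while components that are equally active under every label score close to 0.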