18,802 research outputs found
F-measure Maximization in Multi-Label Classification with Conditionally Independent Label Subsets
We discuss a method to improve the exact F-measure maximization algorithm
called GFM, proposed in (Dembczynski et al. 2011) for multi-label
classification, assuming the label set can be can partitioned into
conditionally independent subsets given the input features. If the labels were
all independent, the estimation of only parameters ( denoting the number
of labels) would suffice to derive Bayes-optimal predictions in
operations. In the general case, parameters are required by GFM, to
solve the problem in operations. In this work, we show that the number
of parameters can be reduced further to , in the best case, assuming the
label set can be partitioned into conditionally independent subsets. As
this label partition needs to be estimated from the data beforehand, we use
first the procedure proposed in (Gasse et al. 2015) that finds such partition
and then infer the required parameters locally in each label subset. The latter
are aggregated and serve as input to GFM to form the Bayes-optimal prediction.
We show on a synthetic experiment that the reduction in the number of
parameters brings about significant benefits in terms of performance
Hyperbolic Interaction Model For Hierarchical Multi-Label Classification
Different from the traditional classification tasks which assume mutual
exclusion of labels, hierarchical multi-label classification (HMLC) aims to
assign multiple labels to every instance with the labels organized under
hierarchical relations. Besides the labels, since linguistic ontologies are
intrinsic hierarchies, the conceptual relations between words can also form
hierarchical structures. Thus it can be a challenge to learn mappings from word
hierarchies to label hierarchies. We propose to model the word and label
hierarchies by embedding them jointly in the hyperbolic space. The main reason
is that the tree-likeness of the hyperbolic space matches the complexity of
symbolic data with hierarchical structures. A new Hyperbolic Interaction Model
(HyperIM) is designed to learn the label-aware document representations and
make predictions for HMLC. Extensive experiments are conducted on three
benchmark datasets. The results have demonstrated that the new model can
realistically capture the complex data structures and further improve the
performance for HMLC comparing with the state-of-the-art methods. To facilitate
future research, our code is publicly available
Multi-Label Learning with Label Enhancement
The task of multi-label learning is to predict a set of relevant labels for
the unseen instance. Traditional multi-label learning algorithms treat each
class label as a logical indicator of whether the corresponding label is
relevant or irrelevant to the instance, i.e., +1 represents relevant to the
instance and -1 represents irrelevant to the instance. Such label represented
by -1 or +1 is called logical label. Logical label cannot reflect different
label importance. However, for real-world multi-label learning problems, the
importance of each possible label is generally different. For the real
applications, it is difficult to obtain the label importance information
directly. Thus we need a method to reconstruct the essential label importance
from the logical multilabel data. To solve this problem, we assume that each
multi-label instance is described by a vector of latent real-valued labels,
which can reflect the importance of the corresponding labels. Such label is
called numerical label. The process of reconstructing the numerical labels from
the logical multi-label data via utilizing the logical label information and
the topological structure in the feature space is called Label Enhancement. In
this paper, we propose a novel multi-label learning framework called LEMLL,
i.e., Label Enhanced Multi-Label Learning, which incorporates regression of the
numerical labels and label enhancement into a unified framework. Extensive
comparative studies validate that the performance of multi-label learning can
be improved significantly with label enhancement and LEMLL can effectively
reconstruct latent label importance information from logical multi-label data.Comment: ICDM 201
- …