1,511 research outputs found
F-measure Maximization in Multi-Label Classification with Conditionally Independent Label Subsets
We discuss a method to improve the exact F-measure maximization algorithm
called GFM, proposed in (Dembczynski et al. 2011) for multi-label
classification, assuming the label set can be can partitioned into
conditionally independent subsets given the input features. If the labels were
all independent, the estimation of only parameters ( denoting the number
of labels) would suffice to derive Bayes-optimal predictions in
operations. In the general case, parameters are required by GFM, to
solve the problem in operations. In this work, we show that the number
of parameters can be reduced further to , in the best case, assuming the
label set can be partitioned into conditionally independent subsets. As
this label partition needs to be estimated from the data beforehand, we use
first the procedure proposed in (Gasse et al. 2015) that finds such partition
and then infer the required parameters locally in each label subset. The latter
are aggregated and serve as input to GFM to form the Bayes-optimal prediction.
We show on a synthetic experiment that the reduction in the number of
parameters brings about significant benefits in terms of performance
Information-Theoretic Active Learning for Content-Based Image Retrieval
We propose Information-Theoretic Active Learning (ITAL), a novel batch-mode
active learning method for binary classification, and apply it for acquiring
meaningful user feedback in the context of content-based image retrieval.
Instead of combining different heuristics such as uncertainty, diversity, or
density, our method is based on maximizing the mutual information between the
predicted relevance of the images and the expected user feedback regarding the
selected batch. We propose suitable approximations to this computationally
demanding problem and also integrate an explicit model of user behavior that
accounts for possible incorrect labels and unnameable instances. Furthermore,
our approach does not only take the structure of the data but also the expected
model output change caused by the user feedback into account. In contrast to
other methods, ITAL turns out to be highly flexible and provides
state-of-the-art performance across various datasets, such as MIRFLICKR and
ImageNet.Comment: GCPR 2018 paper (14 pages text + 2 pages references + 6 pages
appendix
- …