On Classification with Bags, Groups and Sets
Many classification problems can be difficult to formulate directly in terms
of the traditional supervised setting, where both training and test samples are
individual feature vectors. There are cases in which samples are better
described by sets of feature vectors, in which labels are only available for sets
rather than individual samples, or in which individual labels, if available,
are not independent. To better deal with such problems, several
extensions of supervised learning have been proposed, in which training
and/or test objects are sets of feature vectors. However, having been proposed
rather independently of each other, their mutual similarities and differences
have hitherto not been mapped out. In this work, we provide an overview of such
learning scenarios, propose a taxonomy to illustrate the relationships between
them, and discuss directions for further research in these areas.
A comparison of multiple instance and group based learning
In this paper we compare the performance of a number of multiple-instance learning (MIL) and group-based (GB) classification algorithms on both a synthetic dataset and a real-world Pap smear dataset. We use the synthetic dataset to demonstrate that performance improves as both bag size and the percentage of positives increase, and that MIL algorithms outperform GB algorithms when the percentage of positives is less than 50%. However, as the positive bags become increasingly homogeneous, as is apparent on the real-world dataset, the two approaches become comparable. This result highlights that the performance of a MIL or GB algorithm is maximised when the algorithm's MIL assumption matches the reality of the dataset. Therefore, on the Pap smear dataset, algorithms with a more generalised MIL assumption demonstrate the strongest performance. © 2012 IEEE
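
The contrast between the two families can be illustrated with a minimal Python sketch. It is not taken from the paper: the function names are hypothetical, the "at least one positive instance" rule stands in for the standard MIL assumption, and a simple majority vote is assumed as a stand-in for a group-based combiner. It also assumes that "positives" refers to positive instances within a bag.

```python
from typing import Sequence


def mil_bag_label(instance_preds: Sequence[int]) -> int:
    """Standard MIL assumption (illustrative): a bag is positive
    if at least one of its instances is predicted positive."""
    return int(any(p == 1 for p in instance_preds))


def group_vote_label(instance_preds: Sequence[int]) -> int:
    """Assumed group-based rule (illustrative): a bag is positive
    only if more than half of its instances are predicted positive."""
    return int(2 * sum(instance_preds) > len(instance_preds))


# Hypothetical bags of per-instance predictions (1 = positive, 0 = negative).
sparse_bag = [0, 0, 1, 0, 0]  # 20% positives: the two rules disagree
dense_bag = [1, 1, 1, 0, 1]   # 80% positives: the two rules agree

for name, bag in [("sparse", sparse_bag), ("dense", dense_bag)]:
    print(name, mil_bag_label(bag), group_vote_label(bag))
```

Under these assumptions the two rules disagree only on the bag where positives are a minority, which is consistent with the abstract's observation that the approaches become comparable as positive bags become more homogeneous.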