21 research outputs found
Dissimilarity-based Ensembles for Multiple Instance Learning
In multiple instance learning, objects are sets (bags) of feature vectors
(instances) rather than individual feature vectors. In this paper we address
the problem of how these bags can best be represented. Two standard approaches
are to use (dis)similarities between bags and prototype bags, or between bags
and prototype instances. The first approach results in a relatively
low-dimensional representation determined by the number of training bags, while
the second approach results in a relatively high-dimensional representation,
determined by the total number of instances in the training set. In this paper
a third, intermediate approach is proposed, which links the two approaches and
combines their strengths. Our classifier is inspired by a random subspace
ensemble, and considers subspaces of the dissimilarity space, defined by
subsets of instances, as prototypes. We provide guidelines for using such an
ensemble, and show state-of-the-art performances on a range of multiple
instance learning problems.
Comment: Submitted to IEEE Transactions on Neural Networks and Learning
Systems, Special Issue on Learning in Non-(geo)metric Space
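
The three bag representations described in the abstract above can be illustrated with a small sketch. It is not the authors' implementation. The Euclidean metric, the minimum-distance rule for bag-to-instance dissimilarity, the mean-of-minimum rule for bag-to-bag dissimilarity, and all function names are assumptions chosen for illustration; the abstract does not specify them. The toy data are random.

```python
import numpy as np

def instance_dissimilarities(bags, prototypes):
    """Bag-to-prototype-instance representation (assumed rule: minimum distance)."""
    rows = []
    for bag in bags:
        # Pairwise Euclidean distances, shape (n_instances_in_bag, n_prototypes).
        d = np.linalg.norm(bag[:, None, :] - prototypes[None, :, :], axis=2)
        rows.append(d.min(axis=0))
    return np.vstack(rows)

def bag_dissimilarities(bags, prototype_bags):
    """Bag-to-prototype-bag representation (assumed rule: mean of minimum distances)."""
    out = np.empty((len(bags), len(prototype_bags)))
    for i, bag in enumerate(bags):
        for j, proto in enumerate(prototype_bags):
            d = np.linalg.norm(bag[:, None, :] - proto[None, :, :], axis=2)
            out[i, j] = d.min(axis=1).mean()
    return out

def random_subspaces(n_prototypes, n_subspaces, subspace_size, seed=0):
    """Draw random subsets of prototype-instance indices, one per ensemble member."""
    rng = np.random.default_rng(seed)
    return [rng.choice(n_prototypes, size=subspace_size, replace=False)
            for _ in range(n_subspaces)]

# Toy data: 4 bags with 2 to 5 three-dimensional instances each.
rng = np.random.default_rng(0)
train_bags = [rng.normal(size=(rng.integers(2, 6), 3)) for _ in range(4)]
all_instances = np.vstack(train_bags)

# Low-dimensional view: one feature per training bag.
D_bag = bag_dissimilarities(train_bags, train_bags)
# High-dimensional view: one feature per training instance.
D_inst = instance_dissimilarities(train_bags, all_instances)
# Intermediate view: each ensemble member sees a subset of the instance columns.
subspaces = random_subspaces(D_inst.shape[1], n_subspaces=5, subspace_size=3)
views = [D_inst[:, idx] for idx in subspaces]
```

Each entry of `views` could then be passed to a separate base classifier whose outputs are combined. The choice of base classifier and combining rule is left open here because the abstract does not state it.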
5M: Multi-Instance Multi-Cluster based Weakly Supervised MIL Model for Multimedia Data Mining
The rapid growth of unannotated multimedia data, both online and offline, and of the associated mining applications calls for efficient mining algorithms. Multiple instance learning (MIL) has emerged as one of the most effective approaches to mining large volumes of unannotated data. It still requires better instance selection to mine and classify large multimedia collections well. In critical multimedia mining applications, such as medical data processing or content-based information retrieval, instance verification can be of great significance for optimizing MIL. With this motivation, this paper proposes a Multi-Instance, Multi-Cluster based MIL scheme (MIMC-MIL) for efficient mining and classification of large unannotated multimedia data with different features. The proposed system employs softmax approximation techniques with a novel loss factor and an inter-instance distance-based weight estimation scheme for instance probability substantiation in bags