Bag-Level Aggregation for Multiple Instance Active Learning in Instance Classification Problems
A growing number of applications, e.g. video surveillance and medical image
analysis, require training recognition systems from large amounts of weakly
annotated data while some targeted interactions with a domain expert are
allowed to improve the training process. In such cases, active learning (AL)
can reduce labeling costs for training a classifier by querying the expert to
provide the labels of most informative instances. This paper focuses on AL
methods for instance classification problems in multiple instance learning
(MIL), where data is arranged into sets, called bags, that are weakly labeled.
Most AL methods focus on single instance learning problems. These methods are
not suitable for MIL problems because they cannot account for the bag structure
of data. In this paper, new methods for bag-level aggregation of instance
informativeness are proposed for multiple instance active learning (MIAL). The
aggregated informativeness method identifies the most informative
instances based on classifier uncertainty, and queries bags incorporating the
most information. The other proposed method, called cluster-based
aggregative sampling, clusters data hierarchically in the instance space. The
informativeness of instances is assessed by considering bag labels, inferred
instance labels, and the proportion of labels that remain to be discovered in
clusters. Both proposed methods significantly outperform reference methods in
extensive experiments using benchmark data from several application domains.
Results indicate that using an appropriate strategy to address MIAL problems
yields a significant reduction in the number of queries needed to achieve the
same level of performance as single instance AL methods.
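
The bag-level aggregation idea described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the uncertainty measure (1 - |2p - 1|), the use of a sum as the aggregation rule, and the names `instance_uncertainty` and `select_bag` are all assumptions made for illustration.

```python
from collections import defaultdict


def instance_uncertainty(p_pos):
    """Assumed uncertainty score: 1 at p = 0.5, 0 at p = 0 or p = 1."""
    return 1.0 - abs(2.0 * p_pos - 1.0)


def select_bag(probs, bag_ids, labeled_bags=()):
    """Pick the unlabeled bag whose instances carry the most summed uncertainty.

    probs        -- classifier's positive-class probability for each instance
    bag_ids      -- bag identifier for each instance (same order as probs)
    labeled_bags -- bags already queried, which are skipped
    """
    scores = defaultdict(float)
    for p, bag in zip(probs, bag_ids):
        if bag not in labeled_bags:
            # Aggregation by summation is an assumption of this sketch.
            scores[bag] += instance_uncertainty(p)
    if not scores:
        return None, {}
    best = max(scores, key=scores.get)
    return best, dict(scores)


# Toy data (hypothetical): six instances spread over three bags.
probs = [0.9, 0.5, 0.45, 0.1, 0.05, 0.6]
bag_ids = ["a", "a", "b", "b", "c", "c"]
best, scores = select_bag(probs, bag_ids)
print(best, scores)
```

Summation favors large bags; a mean or maximum would be an equally plausible aggregation rule, and the choice between them is left open here.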
Dissimilarity-based Ensembles for Multiple Instance Learning
In multiple instance learning, objects are sets (bags) of feature vectors
(instances) rather than individual feature vectors. In this paper we address
the problem of how these bags can best be represented. Two standard approaches
are to use (dis)similarities between bags and prototype bags, or between bags
and prototype instances. The first approach results in a relatively
low-dimensional representation determined by the number of training bags, while
the second approach results in a relatively high-dimensional representation,
determined by the total number of instances in the training set. In this paper
a third, intermediate approach is proposed, which links the two approaches and
combines their strengths. Our classifier is inspired by a random subspace
ensemble, and considers subspaces of the dissimilarity space, defined by
subsets of instances, as prototypes. We provide guidelines for using such an
ensemble, and show state-of-the-art performances on a range of multiple
instance learning problems.
Comment: Submitted to IEEE Transactions on Neural Networks and Learning
Systems, Special Issue on Learning in Non-(geo)metric Space
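
The intermediate representation described in this abstract can be sketched as a small ensemble in which each member sees bags through dissimilarities to a random subset of prototype instances. This is only an illustration under stated assumptions: the bag-to-instance dissimilarity (minimum Euclidean distance), the nearest-mean base classifier, majority voting, and all function names are hypothetical choices, not details taken from the paper.

```python
import math
import random


def embed(bag, prototypes):
    """Represent a bag by its minimum distance to each prototype instance."""
    return [min(math.dist(x, p) for x in bag) for p in prototypes]


def fit_nearest_mean(X, y):
    """Assumed base classifier: store the mean feature vector of each class."""
    means = {}
    for label in set(y):
        rows = [x for x, t in zip(X, y) if t == label]
        means[label] = [sum(col) / len(rows) for col in zip(*rows)]
    return means


def predict_nearest_mean(means, x):
    return min(means, key=lambda label: math.dist(means[label], x))


def fit_ensemble(bags, labels, n_members=5, subspace_size=3, seed=0):
    """Train one base classifier per random subset of prototype instances."""
    rng = random.Random(seed)
    pool = [x for bag in bags for x in bag]  # all training instances
    members = []
    for _ in range(n_members):
        protos = rng.sample(pool, min(subspace_size, len(pool)))
        X = [embed(bag, protos) for bag in bags]
        members.append((protos, fit_nearest_mean(X, labels)))
    return members


def predict_ensemble(members, bag):
    """Combine members by majority vote (an assumption of this sketch)."""
    votes = [predict_nearest_mean(m, embed(bag, protos)) for protos, m in members]
    return max(set(votes), key=votes.count)


# Toy data (hypothetical): two negative bags near the origin, two positive near (5, 5).
bags = [
    [(0.0, 0.0), (0.2, 0.1)],
    [(0.1, 0.3), (-0.2, 0.0)],
    [(5.0, 5.0), (4.8, 5.1)],
    [(5.2, 4.9), (5.1, 5.3)],
]
labels = [0, 0, 1, 1]
members = fit_ensemble(bags, labels)
print(predict_ensemble(members, [(4.9, 5.2), (5.3, 5.0)]))
```

The subset size sets the dimensionality each member works in, between the few dimensions of a bag-prototype representation and the many of a full instance-prototype one.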
Automation of motor dexterity assessment
Motor dexterity assessment is regularly performed in rehabilitation wards to establish patient status, and automation of this routine task is sought. A system for automating the assessment of motor dexterity, based on the Fugl-Meyer scale and with loose restrictions on sensing technologies, is presented. The system consists of two main elements: 1) a data representation that abstracts the low-level information obtained from a variety of sensors into a highly separable, low-dimensional encoding using t-distributed Stochastic Neighbor Embedding, and 2) central to this communication, a multi-label classifier that boosts classification rates by exploiting the fact that the classes corresponding to the individual exercises are naturally organized as a network. Depending on the targeted therapeutic movement, class labels, i.e. exercise scores, are highly correlated (patients who perform well in one exercise tend to perform well in related exercises); critically, however, no node can be used as a proxy for others, since an exercise does not encode the information of other exercises. On data from a cohort of 20 patients, the novel classifier outperforms classical Naive Bayes, random forest, and variants of support vector machines (ANOVA: p < 0.001). The novel multi-label classification strategy completes an automatic system for motor dexterity assessment, with implications for lessening therapists' workloads, reducing healthcare costs, and providing support for home-based virtual rehabilitation and telerehabilitation alternatives.