23,564 research outputs found

    Bag-Level Aggregation for Multiple Instance Active Learning in Instance Classification Problems

    Full text link
    A growing number of applications, e.g. video surveillance and medical image analysis, require training recognition systems from large amounts of weakly annotated data while some targeted interactions with a domain expert are allowed to improve the training process. In such cases, active learning (AL) can reduce labeling costs for training a classifier by querying the expert to provide the labels of most informative instances. This paper focuses on AL methods for instance classification problems in multiple instance learning (MIL), where data is arranged into sets, called bags, that are weakly labeled. Most AL methods focus on single instance learning problems. These methods are not suitable for MIL problems because they cannot account for the bag structure of data. In this paper, new methods for bag-level aggregation of instance informativeness are proposed for multiple instance active learning (MIAL). The \textit{aggregated informativeness} method identifies the most informative instances based on classifier uncertainty, and queries bags incorporating the most information. The other proposed method, called \textit{cluster-based aggregative sampling}, clusters data hierarchically in the instance space. The informativeness of instances is assessed by considering bag labels, inferred instance labels, and the proportion of labels that remain to be discovered in clusters. Both proposed methods significantly outperform reference methods in extensive experiments using benchmark data from several application domains. Results indicate that using an appropriate strategy to address MIAL problems yields a significant reduction in the number of queries needed to achieve the same level of performance as single instance AL methods

    Dissimilarity-based Ensembles for Multiple Instance Learning

    Get PDF
    In multiple instance learning, objects are sets (bags) of feature vectors (instances) rather than individual feature vectors. In this paper we address the problem of how these bags can best be represented. Two standard approaches are to use (dis)similarities between bags and prototype bags, or between bags and prototype instances. The first approach results in a relatively low-dimensional representation determined by the number of training bags, while the second approach results in a relatively high-dimensional representation, determined by the total number of instances in the training set. In this paper a third, intermediate approach is proposed, which links the two approaches and combines their strengths. Our classifier is inspired by a random subspace ensemble, and considers subspaces of the dissimilarity space, defined by subsets of instances, as prototypes. We provide guidelines for using such an ensemble, and show state-of-the-art performances on a range of multiple instance learning problems.Comment: Submitted to IEEE Transactions on Neural Networks and Learning Systems, Special Issue on Learning in Non-(geo)metric Space

    Automation of motor dexterity assessment

    Get PDF
    Motor dexterity assessment is regularly performed in rehabilitation wards to establish patient status and automatization for such routinary task is sought. A system for automatizing the assessment of motor dexterity based on the Fugl-Meyer scale and with loose restrictions on sensing technologies is presented. The system consists of two main elements: 1) A data representation that abstracts the low level information obtained from a variety of sensors, into a highly separable low dimensionality encoding employing t-distributed Stochastic Neighbourhood Embedding, and, 2) central to this communication, a multi-label classifier that boosts classification rates by exploiting the fact that the classes corresponding to the individual exercises are naturally organized as a network. Depending on the targeted therapeutic movement class labels i.e. exercises scores, are highly correlated-patients who perform well in one, tends to perform well in related exercises-; and critically no node can be used as proxy of others - an exercise does not encode the information of other exercises. Over data from a cohort of 20 patients, the novel classifier outperforms classical Naive Bayes, random forest and variants of support vector machines (ANOVA: p <; 0.001). The novel multi-label classification strategy fulfills an automatic system for motor dexterity assessment, with implications for lessening therapist's workloads, reducing healthcare costs and providing support for home-based virtual rehabilitation and telerehabilitation alternatives
    • …
    corecore