Bag-Level Aggregation for Multiple Instance Active Learning in Instance Classification Problems

Author: Carbonneau Marc-André
Gagnon Ghyslain
Granger Eric
Publication date: 06/10/2017

A growing number of applications, e.g. video surveillance and medical image analysis, require training recognition systems from large amounts of weakly annotated data while some targeted interactions with a domain expert are allowed to improve the training process. In such cases, active learning (AL) can reduce labeling costs for training a classifier by querying the expert to provide the labels of the most informative instances. This paper focuses on AL methods for instance classification problems in multiple instance learning (MIL), where data is arranged into sets, called bags, that are weakly labeled. Most AL methods focus on single instance learning problems. These methods are not suitable for MIL problems because they cannot account for the bag structure of data. In this paper, new methods for bag-level aggregation of instance informativeness are proposed for multiple instance active learning (MIAL). The aggregated informativeness method identifies the most informative instances based on classifier uncertainty, and queries bags incorporating the most information. The other proposed method, called cluster-based aggregative sampling, clusters data hierarchically in the instance space. The informativeness of instances is assessed by considering bag labels, inferred instance labels, and the proportion of labels that remain to be discovered in clusters. Both proposed methods significantly outperform reference methods in extensive experiments using benchmark data from several application domains. Results indicate that using an appropriate strategy to address MIAL problems yields a significant reduction in the number of queries needed to achieve the same level of performance as single instance AL methods.
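The idea of scoring bags by aggregating instance-level uncertainty can be illustrated with a small sketch. The abstract does not give the scoring formula, so everything here is an assumption made for illustration: uncertainty is taken as 1 - |2p - 1| for a predicted positive probability p, the aggregate is a plain sum over the bag, and the names instance_uncertainty and select_bag are hypothetical.

```python
from typing import Dict, List


def instance_uncertainty(p: float) -> float:
    """Assumed uncertainty measure: 1 at p = 0.5, 0 at p = 0 or p = 1."""
    return 1.0 - abs(2.0 * p - 1.0)


def select_bag(bag_probs: Dict[str, List[float]]) -> str:
    """Return the bag whose instances carry the largest summed uncertainty.

    Summation is an illustrative choice of aggregation, not the paper's rule.
    """
    scores = {
        bag: sum(instance_uncertainty(p) for p in probs)
        for bag, probs in bag_probs.items()
    }
    return max(scores, key=scores.get)


# Hypothetical classifier outputs: P(instance is positive) for each bag.
bags = {
    "bag_a": [0.95, 0.90, 0.05],  # confident predictions
    "bag_b": [0.55, 0.45, 0.60],  # many uncertain instances
    "bag_c": [0.50, 0.99],        # one very uncertain instance
}
chosen = select_bag(bags)
print(chosen)
```

With a sum, a bag holding several moderately uncertain instances outranks a bag holding a single maximally uncertain one; a mean or maximum would rank the same bags differently.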

arXiv.org e-Print Archive

Distribution of Behaviour into Parallel Communicating Subsystems

Author: Duhaiby Omar al
Groote Jan Friso
Publication venue: 'Open Publishing Association'
Publication date: 01/01/2019

The process of decomposing a complex system into simpler subsystems has been of interest to computer scientists over many decades, for instance, for the field of distributed computing. In this paper, motivated by the desire to distribute the process of active automata learning onto multiple subsystems, we study the equivalence between a system and the total behaviour of its decomposition which comprises subsystems with communication between them. We show synchronously- and asynchronously-communicating decompositions that maintain branching bisimilarity, and we prove that there is no decomposition operator that maintains divergence-preserving branching bisimilarity over all LTSs.
Comment: In Proceedings EXPRESS/SOS 2019, arXiv:1908.0821

arXiv.org e-Print Archive

Repository TU/e

Pure OAI Repository

Diverse expected gradient active learning for relative attributes

Author: Tao D
Wang R
You X
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2014

The use of relative attributes for semantic understanding of images and videos is a promising way to improve communication between humans and machines. However, it is extremely labor- and time-consuming to define multiple attributes for each instance in large amounts of data. One option is to incorporate active learning, so that the informative samples can be actively discovered and then labeled. However, most existing active-learning methods select samples one at a time (serial mode), and may therefore lose efficiency when learning multiple attributes. In this paper, we propose a batch-mode active-learning method, called diverse expected gradient active learning. This method integrates an informativeness analysis and a diversity analysis to form a diverse batch of queries. Specifically, the informativeness analysis employs the expected pairwise gradient length as a measure of informativeness, while the diversity analysis forces a constraint on the proposed diverse gradient angle. Since simultaneous optimization of these two parts is intractable, we utilize a two-step procedure to obtain the diverse batch of queries. A heuristic method is also introduced to suppress imbalanced multiclass distributions. Empirical evaluations on three different databases demonstrate the effectiveness and efficiency of the proposed approach. © 1992-2012 IEEE
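A two-step batch selection of the kind described above might look roughly like the following sketch. It is not the authors' procedure: the gradient vectors are assumed to be precomputed and nonzero, gradient length stands in for the informativeness score, a minimum pairwise angle stands in for the diversity constraint, and select_batch with its parameters is a hypothetical name.

```python
import math
from typing import List, Sequence


def norm(v: Sequence[float]) -> float:
    """Euclidean length of a vector."""
    return math.sqrt(sum(x * x for x in v))


def angle(u: Sequence[float], v: Sequence[float]) -> float:
    """Angle in radians between two nonzero vectors."""
    cos = sum(a * b for a, b in zip(u, v)) / (norm(u) * norm(v))
    return math.acos(max(-1.0, min(1.0, cos)))  # clamp against rounding


def select_batch(grads: List[Sequence[float]], shortlist_size: int,
                 batch_size: int, min_angle: float) -> List[int]:
    """Step 1: shortlist by gradient length. Step 2: greedy diverse pick."""
    order = sorted(range(len(grads)), key=lambda i: norm(grads[i]), reverse=True)
    shortlist = order[:shortlist_size]
    batch: List[int] = []
    for i in shortlist:
        # Keep a candidate only if it points away from everything chosen so far.
        if all(angle(grads[i], grads[j]) >= min_angle for j in batch):
            batch.append(i)
        if len(batch) == batch_size:
            break
    return batch


# Hypothetical expected-gradient vectors for five unlabeled samples.
grads = [(3.0, 0.0), (2.9, 0.1), (0.0, 2.0), (0.1, 0.1), (-1.5, 0.0)]
batch = select_batch(grads, shortlist_size=4, batch_size=2,
                     min_angle=math.radians(30))
print(batch)
```

Sample 1 has nearly the same gradient direction as sample 0, so the greedy step skips it even though it has the second-longest gradient.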

OPUS - University of Technology Sydney

Implementation of Multiple-Instance Learning in Drug Activity Prediction

Author: Fu Gang
Publication venue: eGrove
Publication date: 01/01/2012

In the context of drug discovery and development, much effort has been exerted to determine which conformers of a given molecule are responsible for the observed biological activity. In this work we aimed to predict bioactive conformers using a variant of supervised learning, named multiple-instance learning. A single molecule, treated as a bag of conformers, is biologically active if and only if at least one of its conformers, treated as an instance, is responsible for the observed bioactivity; and a molecule is inactive if none of its conformers is responsible for the observed bioactivity. The implementation requires instance-based embedding, and joint feature selection and classification. The goal of the present project is to implement multiple-instance learning in drug activity prediction, and subsequently to identify the bioactive conformers for each molecule. We encoded the 3-dimensional structures using pharmacophore fingerprints which are binary strings, and accomplished instance-based embedding using calculated dissimilarity distances. Four dissimilarity measures were employed and their performances were compared. 1-norm SVM was used for joint feature selection and classification. The approach was applied to four data sets, and the best proposed model for each data set was determined by using the dissimilarity measure yielding the smallest number of selected features. The predictive abilities of the proposed approach were compared with three classical predictive models without instance-based embedding. The proposed approach produced the best predictive models for one data set and second best predictive models for the rest of the data sets, based on the external validations. To validate the ability of the proposed approach to find bioactive conformers, 12 small molecules with co-crystallized structures were seeded in one data set. 10 out of 12 co-crystallized structures were indeed identified as significant conformers using the proposed approach. 
The proposed approach was demonstrated to be highly competitive with classical predictive models, hence it is very powerful for drug activity prediction. The approach was also validated as a useful method for pursuit of bioactive conformers.
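The bag-labeling rule and the dissimilarity-based embedding mentioned in this abstract can be sketched as follows. The abstract does not name its four dissimilarity measures, so the Tanimoto (Jaccard) dissimilarity on binary fingerprint strings is an assumption here, as are the minimum-over-conformers embedding rule and the names bag_label, tanimoto_dissimilarity and embed_bag.

```python
from typing import List


def bag_label(instance_labels: List[bool]) -> bool:
    """Standard MIL rule: a bag is positive iff at least one instance is."""
    return any(instance_labels)


def tanimoto_dissimilarity(a: str, b: str) -> float:
    """1 - |a AND b| / |a OR b| for equal-length binary strings (assumed measure)."""
    both = sum(x == "1" and y == "1" for x, y in zip(a, b))
    either = sum(x == "1" or y == "1" for x, y in zip(a, b))
    return 0.0 if either == 0 else 1.0 - both / either


def embed_bag(bag: List[str], prototypes: List[str]) -> List[float]:
    """Map a bag to one feature per prototype: its closest conformer's distance."""
    return [min(tanimoto_dissimilarity(c, p) for c in bag) for p in prototypes]


# Hypothetical 4-bit fingerprints: one molecule with two conformers.
molecule = ["1100", "0011"]
prototypes = ["1100", "0110", "1111"]
embedding = embed_bag(molecule, prototypes)
print(embedding)
```

Each molecule becomes a fixed-length vector regardless of how many conformers it has, which is what lets a sparse linear classifier such as a 1-norm SVM select features (prototype conformers) and classify in one step.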

eGrove (Univ. of Mississippi)

Learning Concepts through Multi-Class Diverse Density

Author: Hiransoog Chalita
Publication venue: The University of Edinburgh
Publication date: 01/01/2008

Institute of Perception, Action and Behaviour
This research investigates the possibility of creating an intelligent system based on the philosophy that the world is ambiguous and that a system gains knowledge by learning from ambiguous examples, where the learning can be improved further when the system is allowed to play an active role in requesting these examples. This philosophy bridges the gap between traditional Artificial Intelligence (knowledge-based AI) and behaviour-oriented Artificial Intelligence (intelligence emerging from behaviour). Concept learning, due to its simplicity and the features needed to prove this philosophy, is chosen as the studied platform. Based on the aforementioned philosophy, the task of concept learning is comparable to the multiple-instance learning framework, which is modified here to handle more classes than the original two-class problem; this is named the multi-class problem. The multi-class multiple-instance learning problem is thus defined. One of the methods used to solve the original multiple-instance learning framework, the Diverse Density method, is selected due to its simplicity, robustness, and incremental property. The method is then modified to solve the newly defined multi-class multiple-instance learning problem. To explore its functionality and efficiency, the modified method, multi-class Diverse Density, was tested on both artificial data and real-world applications: a stock prediction task, an assembly task, and document search. It was found that redefining the two-class problem as multi-class problems allows a wider range of ambiguous concepts to be better captured than is possible with the original multiple-instance learning framework. Moreover, interactivity, the ability to play an active role in requesting or suggesting examples to learn, was proven to enhance the learning process when integrated into the multi-class Diverse Density method.
In summary, this research proves that the task of concept learning of ambiguous objects can be solved using the proposed multi-class Diverse Density method, where the added interactivity feature improves the learning further.
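For orientation, the original two-class Diverse Density score with the noisy-or model can be sketched as below. This is the standard formulation as commonly stated, not the thesis's multi-class variant, and it assumes unscaled features with a Gaussian-like instance probability exp(-||x - t||^2); the function names and sample points are hypothetical. It needs Python 3.8 or later for math.prod.

```python
import math
from typing import List, Sequence

Point = Sequence[float]


def instance_prob(x: Point, t: Point) -> float:
    """Probability that instance x belongs to the concept located at t."""
    return math.exp(-sum((a - b) ** 2 for a, b in zip(x, t)))


def diverse_density(t: Point, positive_bags: List[List[Point]],
                    negative_bags: List[List[Point]]) -> float:
    """Noisy-or Diverse Density of a candidate concept point t."""
    dd = 1.0
    for bag in positive_bags:
        # A positive bag supports t if at least one instance lies near t.
        dd *= 1.0 - math.prod(1.0 - instance_prob(x, t) for x in bag)
    for bag in negative_bags:
        # A negative bag supports t only if every instance lies far from t.
        dd *= math.prod(1.0 - instance_prob(x, t) for x in bag)
    return dd


# Hypothetical 2-D data: both positive bags have an instance near the origin.
positive_bags = [[(0.0, 0.0), (5.0, 5.0)], [(0.1, -0.1), (-4.0, 3.0)]]
negative_bags = [[(5.0, 5.1)]]
dd_origin = diverse_density((0.0, 0.0), positive_bags, negative_bags)
dd_far = diverse_density((5.0, 5.0), positive_bags, negative_bags)
print(dd_origin, dd_far)
```

The score is high near the origin, where every positive bag has a nearby instance and no negative instance is close, and collapses near (5, 5), which only one positive bag supports and a negative instance contradicts.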

Edinburgh Research Archive