48,231 research outputs found
Bag-Level Aggregation for Multiple Instance Active Learning in Instance Classification Problems
A growing number of applications, e.g. video surveillance and medical image
analysis, require training recognition systems from large amounts of weakly
annotated data while some targeted interactions with a domain expert are
allowed to improve the training process. In such cases, active learning (AL)
can reduce labeling costs for training a classifier by querying the expert to
provide the labels of most informative instances. This paper focuses on AL
methods for instance classification problems in multiple instance learning
(MIL), where data is arranged into sets, called bags, that are weakly labeled.
Most AL methods focus on single instance learning problems. These methods are
not suitable for MIL problems because they cannot account for the bag structure
of data. In this paper, new methods for bag-level aggregation of instance
informativeness are proposed for multiple instance active learning (MIAL). The
\textit{aggregated informativeness} method identifies the most informative
instances based on classifier uncertainty, and queries bags incorporating the
most information. The other proposed method, called \textit{cluster-based
aggregative sampling}, clusters data hierarchically in the instance space. The
informativeness of instances is assessed by considering bag labels, inferred
instance labels, and the proportion of labels that remain to be discovered in
clusters. Both proposed methods significantly outperform reference methods in
extensive experiments using benchmark data from several application domains.
Results indicate that using an appropriate strategy to address MIAL problems
yields a significant reduction in the number of queries needed to achieve the
same level of performance as single instance AL methods
DAMEWARE - Data Mining & Exploration Web Application Resource
Astronomy is undergoing through a methodological revolution triggered by an
unprecedented wealth of complex and accurate data. DAMEWARE (DAta Mining &
Exploration Web Application and REsource) is a general purpose, Web-based,
Virtual Observatory compliant, distributed data mining framework specialized in
massive data sets exploration with machine learning methods. We present the
DAMEWARE (DAta Mining & Exploration Web Application REsource) which allows the
scientific community to perform data mining and exploratory experiments on
massive data sets, by using a simple web browser. DAMEWARE offers several tools
which can be seen as working environments where to choose data analysis
functionalities such as clustering, classification, regression, feature
extraction etc., together with models and algorithms.Comment: User Manual of the DAMEWARE Web Application, 51 page
Entropy-based active learning for object recognition
Most methods for learning object categories require large amounts of labeled training data. However, obtaining such data can be a difficult and time-consuming endeavor. We have developed a novel, entropy-based ldquoactive learningrdquo approach which makes significant progress towards this problem. The main idea is to sequentially acquire labeled data by presenting an oracle (the user) with unlabeled images that will be particularly informative when labeled. Active learning adaptively prioritizes the order in which the training examples are acquired, which, as shown by our experiments, can significantly reduce the overall number of training examples required to reach near-optimal performance. At first glance this may seem counter-intuitive: how can the algorithm know whether a group of unlabeled images will be informative, when, by definition, there is no label directly associated with any of the images? Our approach is based on choosing an image to label that maximizes the expected amount of information we gain about the set of unlabeled images. The technique is demonstrated in several contexts, including improving the efficiency of Web image-search queries and open-world visual learning by an autonomous agent. Experiments on a large set of 140 visual object categories taken directly from text-based Web image searches show that our technique can provide large improvements (up to 10 x reduction in the number of training examples needed) over baseline techniques
Data-Driven Shape Analysis and Processing
Data-driven methods play an increasingly important role in discovering
geometric, structural, and semantic relationships between 3D shapes in
collections, and applying this analysis to support intelligent modeling,
editing, and visualization of geometric data. In contrast to traditional
approaches, a key feature of data-driven approaches is that they aggregate
information from a collection of shapes to improve the analysis and processing
of individual shapes. In addition, they are able to learn models that reason
about properties and relationships of shapes without relying on hard-coded
rules or explicitly programmed instructions. We provide an overview of the main
concepts and components of these techniques, and discuss their application to
shape classification, segmentation, matching, reconstruction, modeling and
exploration, as well as scene analysis and synthesis, through reviewing the
literature and relating the existing works with both qualitative and numerical
comparisons. We conclude our report with ideas that can inspire future research
in data-driven shape analysis and processing.Comment: 10 pages, 19 figure
- …