Bag-Level Aggregation for Multiple Instance Active Learning in Instance Classification Problems
A growing number of applications, e.g. video surveillance and medical image
analysis, require training recognition systems from large amounts of weakly
annotated data while some targeted interactions with a domain expert are
allowed to improve the training process. In such cases, active learning (AL)
can reduce labeling costs for training a classifier by querying the expert to
provide the labels of the most informative instances. This paper focuses on AL
methods for instance classification problems in multiple instance learning
(MIL), where data is arranged into sets, called bags, that are weakly labeled.
Most AL methods focus on single instance learning problems. These methods are
not suitable for MIL problems because they cannot account for the bag structure
of data. In this paper, new methods for bag-level aggregation of instance
informativeness are proposed for multiple instance active learning (MIAL). The
aggregated informativeness method identifies the most informative
instances based on classifier uncertainty, and queries bags incorporating the
most information. The other proposed method, called cluster-based
aggregative sampling, clusters data hierarchically in the instance space. The
informativeness of instances is assessed by considering bag labels, inferred
instance labels, and the proportion of labels that remain to be discovered in
clusters. Both proposed methods significantly outperform reference methods in
extensive experiments using benchmark data from several application domains.
Results indicate that using an appropriate strategy to address MIAL problems
yields a significant reduction in the number of queries needed to achieve the
same level of performance as single instance AL methods.
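The idea of aggregating instance informativeness at the bag level can be illustrated with a minimal sketch. The abstract does not give the exact scoring formula, so everything here is an assumption made for illustration: the uncertainty measure `1 - |2p - 1|`, the use of a plain sum as the aggregate, and the names `instance_uncertainty`, `bag_scores`, and `select_bag`. It is not the authors' implementation.

```python
from typing import Dict, List


def instance_uncertainty(p_positive: float) -> float:
    """Assumed uncertainty measure: 1 at p = 0.5, 0 at p = 0 or p = 1."""
    return 1.0 - abs(2.0 * p_positive - 1.0)


def bag_scores(bag_probs: Dict[str, List[float]]) -> Dict[str, float]:
    """Aggregate instance uncertainty per bag (here: a simple sum)."""
    return {
        bag_id: sum(instance_uncertainty(p) for p in probs)
        for bag_id, probs in bag_probs.items()
    }


def select_bag(bag_probs: Dict[str, List[float]]) -> str:
    """Return the id of the bag with the highest aggregated uncertainty."""
    scores = bag_scores(bag_probs)
    return max(scores, key=scores.get)


if __name__ == "__main__":
    # Hypothetical classifier outputs: P(positive) for each instance in a bag.
    bags = {
        "bag_a": [0.95, 0.90, 0.05],  # confident predictions
        "bag_b": [0.50, 0.60, 0.45],  # mostly uncertain predictions
        "bag_c": [0.10, 0.50],
    }
    print(select_bag(bags))
```

A sum favors large bags with many moderately uncertain instances; a mean or a maximum would be equally plausible aggregates under these assumptions and would rank bags differently.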
ICE: Enabling Non-Experts to Build Models Interactively for Large-Scale Lopsided Problems
Quick interaction between a human teacher and a learning machine presents
numerous benefits and challenges when working with web-scale data. The human
teacher guides the machine towards accomplishing the task of interest. The
learning machine leverages big data to find examples that maximize the training
value of its interaction with the teacher. When the teacher is restricted to
labeling examples selected by the machine, this problem is an instance of
active learning. When the teacher can provide additional information to the
machine (e.g., suggestions on what examples or predictive features should be
used) as the learning task progresses, then the problem becomes one of
interactive learning.
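The active learning setting described above, where the teacher only labels examples chosen by the machine, can be sketched as a single query round. This is a generic uncertainty-sampling illustration, not the ICE platform's method; the function `active_learning_round`, the `score` callable, and the `teacher` callable are all hypothetical names introduced here.

```python
from typing import Callable, Dict, List, Optional


def active_learning_round(
    pool: List[float],
    score: Callable[[float], float],
    labeled: Dict[float, bool],
    teacher: Callable[[float], bool],
) -> Optional[float]:
    """Run one round: the machine picks an example, the teacher labels it.

    The machine selects the unlabeled example whose score is closest to 0.5
    (an assumed notion of "most uncertain"). Returns the picked example, or
    None when every example in the pool is already labeled.
    """
    candidates = [x for x in pool if x not in labeled]
    if not candidates:
        return None
    pick = min(candidates, key=lambda x: abs(score(x) - 0.5))
    labeled[pick] = teacher(pick)  # the teacher's only action is to label
    return pick


if __name__ == "__main__":
    pool = [0.0, 0.2, 0.45, 0.8, 1.0]   # toy one-dimensional examples
    labels: Dict[float, bool] = {}
    picked = active_learning_round(
        pool,
        score=lambda x: x,               # toy model: the score is the feature
        labeled=labels,
        teacher=lambda x: x >= 0.5,      # toy teacher: a hidden threshold
    )
    print(picked, labels)
```

In the interactive setting the abstract contrasts this with, the teacher would have additional channels, such as suggesting examples or features, that this loop does not model.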
To accommodate the two-way communication channel needed for efficient
interactive learning, the teacher and the machine need an environment that
supports an interaction language. The machine can access, process, and
summarize more examples than the teacher can see in a lifetime. Based on the
machine's output, the teacher can revise the definition of the task or make it
more precise. Both the teacher and the machine continuously learn and benefit
from the interaction.
We have built a platform to (1) produce valuable and deployable models and
(2) support research on both the machine learning and user interface challenges
of the interactive learning problem. The platform relies on a dedicated,
low-latency, distributed, in-memory architecture that allows us to construct
web-scale learning machines with quick interaction speed. The purpose of this
paper is to describe this architecture and demonstrate how it supports our
research efforts. Preliminary results are presented as illustrations of the
architecture but are not the primary focus of the paper.