Modeling Multiple Annotator Expertise in the Semi-Supervised Learning Scenario
Learning algorithms normally assume that there is at most one annotation or
label per data point. However, in some scenarios, such as medical diagnosis and
on-line collaboration, multiple annotations may be available. In either case,
obtaining labels for data points can be expensive and time-consuming (in some
circumstances ground-truth may not exist). Semi-supervised learning approaches
have shown that utilizing the unlabeled data is often beneficial in these
cases. This paper presents a probabilistic semi-supervised model and algorithm
that allows for learning from both unlabeled and labeled data in the presence
of multiple annotators. We assume that it is known what annotator labeled which
data points. The proposed approach produces annotator models that allow us to
provide (1) estimates of the true label and (2) estimates of each annotator's
varying expertise, for both labeled and unlabeled data. We provide numerical
comparisons under
various scenarios and with respect to standard semi-supervised learning.
Experiments showed that the presented approach provides clear advantages over
multi-annotator methods that do not use the unlabeled data and over methods
that do not use multi-labeler information.
Comment: Appears in Proceedings of the Twenty-Sixth Conference on Uncertainty
in Artificial Intelligence (UAI 2010)
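The underlying idea of jointly estimating true labels and annotator reliability can be sketched with a much simpler iterative weighted-voting scheme (an illustrative baseline only, not the paper's probabilistic semi-supervised model; all names here are hypothetical):

```python
def estimate_true_labels(votes, n_iters=10):
    """Iteratively estimate binary true labels and per-annotator accuracy.

    votes: dict mapping annotator id -> dict of {item_id: 0/1 label}.
    Simplified illustrative scheme: start from an (unweighted) majority
    vote, then reweight each annotator by agreement with the current
    label estimates and repeat.
    """
    items = sorted({i for v in votes.values() for i in v})
    weight = {a: 1.0 for a in votes}  # every annotator starts equal
    labels = {}
    for _ in range(n_iters):
        # Weighted vote for each item's label (+weight for 1, -weight for 0).
        for i in items:
            score = sum(weight[a] * (1 if votes[a][i] == 1 else -1)
                        for a in votes if i in votes[a])
            labels[i] = 1 if score >= 0 else 0
        # Re-estimate each annotator's accuracy against current labels.
        for a in votes:
            agree = [votes[a][i] == labels[i] for i in votes[a]]
            weight[a] = max(sum(agree) / len(agree), 1e-3)
    return labels, weight
```

Unreliable annotators end up with low weights, so their votes stop moving the label estimates; the paper's model additionally exploits unlabeled data.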
Medical image retrieval and automatic annotation: VPA-SABANCI at ImageCLEF 2009
Advances in medical imaging technology have led to an exponential growth in the number of digital images that need to be acquired, analyzed, classified, stored and retrieved in medical centers. As a result, medical image classification and retrieval have recently gained high interest in the scientific community. Despite several attempts, such as the yearly-held ImageCLEF Medical Image Annotation Competition, the proposed solutions are still far from being sufficiently accurate for real-life implementations.
In this paper we summarize the technical details of our experiments for the ImageCLEF 2009 medical image annotation task. We use a direct and two hierarchical
classification schemes that employ support vector machines and local binary patterns, which are recently developed low-cost texture descriptors. The direct scheme employs a single SVM to automatically annotate X-ray images. The two proposed hierarchical schemes divide the classification task into sub-problems. The first hierarchical scheme exploits ensemble SVMs trained on IRMA sub-codes. The second learns from subgroups of data defined by class frequency. Our experiments show that hierarchical annotation of images by training individual SVMs over each IRMA sub-code dominates its rivals in annotation accuracy, with increased processing time relative to the direct scheme.
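As a rough illustration of the texture descriptor involved, a basic 8-neighbour local binary pattern histogram can be computed as follows (a minimal sketch; the paper's pipeline pairs LBP variants with SVM classifiers, which this does not reproduce):

```python
def lbp_histogram(img):
    """Compute a basic 8-neighbour local binary pattern histogram.

    img: 2-D list of grayscale values. Returns a 256-bin histogram of
    raw counts over the interior pixels. Real systems typically use
    uniform/rotation-invariant LBP variants and per-block histograms.
    """
    h, w = len(img), len(img[0])
    hist = [0] * 256
    # Offsets of the 8 neighbours, clockwise from the top-left corner.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            c = img[y][x]
            code = 0
            # Each neighbour >= centre contributes one bit to the code.
            for bit, (dy, dx) in enumerate(offsets):
                if img[y + dy][x + dx] >= c:
                    code |= 1 << bit
            hist[code] += 1
    return hist
```

The resulting 256-dimensional histogram is the kind of fixed-length, low-cost feature vector that can then be fed to an SVM.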
Active Mining of Parallel Video Streams
The practicality of a video surveillance system is adversely limited by the
number of queries that can be placed on human resources and their vigilance in
response. To transcend this limitation, a major effort under way is to include
software that (fully or at least semi) automatically mines video footage,
reducing the burden imposed to the system. Herein, we propose a semi-supervised
incremental learning framework for evolving visual streams in order to develop
a robust and flexible track classification system. Our proposed method learns
from consecutive batches by updating an ensemble at each time step. It tries to
strike a balance between the performance of the system and the amount of data
that needs to be labelled. Because no restrictions are imposed, the system can
address many practical problems in an evolving multi-camera scenario, such as
concept drift, class evolution, and video streams of varying length, which have
not been
addressed before. Experiments were performed on synthetic as well as real-world
visual data in non-stationary environments, showing high accuracy with fairly
little human collaboration.
Refining Image Categorization by Exploiting Web Images and General Corpus
Studies show that refining real-world categories into semantic subcategories
contributes to better image modeling and classification. Previous image
sub-categorization work relying on labeled images and WordNet's hierarchy is
not only labor-intensive, but also restricted to classifying images into noun
subcategories. To tackle these problems, in this work, we exploit general
corpus information to automatically select and subsequently classify web images
into semantically rich (sub-)categories. The following two major challenges are
well studied: 1) noise in the labels of subcategories derived from the general
corpus; 2) noise in the labels of images retrieved from the web. Specifically,
we first obtain the semantically refined subcategories from the text perspective
and remove the noise with a relevance-based approach. To suppress noisy images
induced by search error, we then formulate image selection and classifier
learning as a multi-class multi-instance learning problem and propose to solve
it with the cutting-plane algorithm. The experiments show significant
performance gains from using the data generated by our approach on both image
categorization and sub-categorization tasks. The proposed approach also
consistently outperforms existing weakly supervised and web-supervised
approaches.
Novelty Detection Under Multi-Instance Multi-Label Framework
Novelty detection plays an important role in machine learning and signal
processing. This paper studies novelty detection in a new setting where the
data object is represented as a bag of instances and associated with multiple
class labels, referred to as multi-instance multi-label (MIML) learning.
Contrary to the common assumption in MIML that each instance in a bag belongs
to one of the known classes, in novelty detection, we focus on the scenario
where bags may contain novel-class instances. The goal is to determine, for any
given instance in a new bag, whether it belongs to a known class or a novel
class. Detecting novelty in the MIML setting captures many real-world phenomena
and has many potential applications. For example, in a collection of tagged
images, the tag may only cover a subset of objects existing in the images.
Discovering an object whose class has not been previously tagged can be useful
for the purpose of soliciting a label for the new object class. To address this
novel problem, we present a discriminative framework for detecting new class
instances. Experiments demonstrate the effectiveness of our proposed method,
and reveal that the presence of unlabeled novel instances in training bags is
helpful to the detection of such instances at the testing stage.
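A crude way to make the known-vs-novel decision concrete is a distance-to-prototype baseline (purely illustrative; the paper proposes a discriminative framework, and the names here are hypothetical):

```python
def novelty_score(instance, class_prototypes):
    """Distance to the nearest known-class prototype.

    instance: feature tuple; class_prototypes: dict class -> prototype
    tuple. A large score means no known class explains the instance well.
    """
    def d2(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(d2(instance, p) for p in class_prototypes.values()) ** 0.5

def is_novel(instance, class_prototypes, threshold):
    """Flag an instance as novel-class if it is far from every prototype."""
    return novelty_score(instance, class_prototypes) > threshold
```

In the MIML setting this decision would be made per instance inside each bag, with the bag-level labels only weakly constraining which instances belong to known classes.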
Weakly supervised segment annotation via expectation kernel density estimation
Since the labelling of positive images/videos is ambiguous in weakly
supervised segment annotation, negative-mining-based methods that use only
intra-class information have emerged. In these methods, negative instances are
used to penalize unknown instances and rank their likelihood of being an
object, which can be viewed as similarity-based voting. However, these methods
1) ignore the information contained in positive bags, and 2) only rank the
likelihood but cannot produce an explicit decision function. In this
paper, we propose a voting scheme involving not only the definite negative
instances but also the ambiguous positive instances to make use of the extra
useful information in the weakly labelled positive bags. In the scheme, each
instance votes for its label with a magnitude arising from the similarity, and
the ambiguous positive instances are assigned soft labels that are iteratively
updated during the voting. It overcomes the limitations of voting using only
the negative bags. We also propose an expectation kernel density estimation
(eKDE) algorithm to gain further insight into the voting mechanism.
Experimental results demonstrate the superiority of our scheme over the
baselines.
Comment: 9 pages, 2 figures
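The flavour of the voting scheme can be sketched with signed Gaussian-kernel votes (an illustrative toy version, not the paper's eKDE formulation; the function names and kernel choice are assumptions):

```python
import math

def kernel_vote(x, negatives, positives, pos_weights, bandwidth=1.0):
    """Score an instance by signed kernel-similarity votes.

    Definite negative instances vote against (penalize) the instance x;
    ambiguous positive instances vote for it, each scaled by a soft
    label in pos_weights (which the full scheme would update iteratively).
    """
    def k(a, b):
        # Gaussian kernel: similarity decays with squared distance.
        d2 = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
        return math.exp(-d2 / (2 * bandwidth ** 2))
    neg = sum(k(x, n) for n in negatives)
    pos = sum(w * k(x, p) for w, p in zip(pos_weights, positives))
    return pos - neg  # > 0 suggests the instance is an object
```

Unlike negative-only voting, the positive bags contribute here, and the sign of the score gives an explicit decision rather than just a ranking.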
Clue: Cross-modal Coherence Modeling for Caption Generation
We use coherence relations inspired by computational models of discourse to
study the information needs and goals of image captioning. Using an annotation
protocol specifically devised for capturing image--caption coherence relations,
we annotate 10,000 instances from publicly-available image--caption pairs. We
introduce a new task for learning inferences in imagery and text, coherence
relation prediction, and show that these coherence annotations can be exploited
to learn relation classifiers as an intermediary step, and also train
coherence-aware, controllable image captioning models. The results show a
dramatic improvement in the consistency and quality of the generated captions
with respect to information needs specified via coherence relations.
Comment: Accepted as a long paper to ACL 202
Multi-Label Classification of Patient Notes: a Case Study on ICD Code Assignment
In the context of the Electronic Health Record, automated diagnosis coding of
patient notes is a useful task, but a challenging one due to the large number
of codes and the length of patient notes. We investigate four models for
assigning multiple ICD codes to discharge summaries taken from both MIMIC II
and III. We present Hierarchical Attention-GRU (HA-GRU), a hierarchical
approach to tag a document by identifying the sentences relevant for each
label. HA-GRU achieves state-of-the-art results. Furthermore, the learned
sentence-level attention layer highlights the model's decision process, allows
easier error analysis, and suggests future directions for improvement.
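The per-label, sentence-level attention idea can be illustrated in miniature (a toy sketch; HA-GRU builds the sentence representations with GRU encoders, which are omitted here, and the names are hypothetical):

```python
import math

def label_attention(sent_vecs, label_query):
    """Per-label attention over sentence representations.

    Scores each sentence vector by its dot product with a label-specific
    query vector, softmax-normalises the scores, and returns the weights
    together with the attention-pooled document vector for that label.
    """
    scores = [sum(s * q for s, q in zip(v, label_query)) for v in sent_vecs]
    m = max(scores)  # subtract max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    weights = [e / z for e in exps]
    dim = len(sent_vecs[0])
    pooled = [sum(w * v[d] for w, v in zip(weights, sent_vecs))
              for d in range(dim)]
    return weights, pooled
```

The weights are what make the model inspectable: for each predicted ICD code, the highest-weighted sentences indicate which parts of the note drove the decision.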
Diverse Yet Efficient Retrieval using Hash Functions
Typical retrieval systems have three requirements: a) Accurate retrieval
i.e., the method should have high precision, b) Diverse retrieval, i.e., the
obtained set of points should be diverse, c) Retrieval time should be small.
However, most of the existing methods address only one or two of the
above-mentioned requirements. In this work, we present a method based on randomized
locality sensitive hashing which tries to address all of the above requirements
simultaneously. While earlier hashing approaches considered approximate
retrieval to be acceptable only for the sake of efficiency, we argue that one
can further exploit approximate retrieval to provide impressive trade-offs
between accuracy and diversity. We extend our method to the problem of
multi-label prediction, where the goal is to output a diverse and accurate set
of labels for a given document in real-time. Moreover, we introduce a new
notion to simultaneously evaluate a method's performance for both the precision
and diversity measures. Finally, we present empirical results on several
different retrieval tasks and show that our method retrieves diverse and
accurate images/labels while ensuring a speed-up over the existing diverse
retrieval approaches.
Comment: 10 pages
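Random-hyperplane locality sensitive hashing, the family this method builds on, can be sketched as follows (a minimal single-table sketch under assumed names; the paper's method adds randomization-driven diversity and multi-label prediction on top):

```python
import random

def make_hash(dim, n_bits, seed=0):
    """Random-hyperplane LSH: each bit is the sign of a dot product.

    Vectors with high cosine similarity tend to fall on the same side
    of each random hyperplane and thus share hash codes.
    """
    rng = random.Random(seed)
    planes = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(n_bits)]
    def h(v):
        code = 0
        for i, p in enumerate(planes):
            if sum(a * b for a, b in zip(v, p)) >= 0:
                code |= 1 << i
        return code
    return h

def build_index(points, h):
    """Group point indices into buckets keyed by their hash code."""
    buckets = {}
    for idx, v in enumerate(points):
        buckets.setdefault(h(v), []).append(idx)
    return buckets

def query(buckets, h, q):
    """Return candidate indices sharing the query's bucket (may be empty)."""
    return buckets.get(h(q), [])
```

Retrieval cost depends on bucket size rather than corpus size; using several independent tables trades extra memory for recall, and the approximation itself is what the paper exploits to inject diversity.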
STAIR Actions: A Video Dataset of Everyday Home Actions
A new large-scale video dataset for human action recognition, called STAIR
Actions, is introduced. STAIR Actions contains 100 categories of action labels
representing fine-grained everyday home actions so that it can be applied to
research in various home tasks such as nursing, caring, and security. In STAIR
Actions, each video has a single action label. Moreover, for each action
category, there are around 1,000 videos that were obtained from YouTube or
produced by crowdsourcing workers. The duration of each video is mostly five to
six seconds. The total number of videos is 102,462. We explain how we
constructed STAIR Actions and show the characteristics of STAIR Actions
compared to existing datasets for human action recognition. Experiments with
three major models for action recognition show that STAIR Actions can train
large models and achieve good performance. STAIR Actions can be downloaded from
http://actions.stair.center
Comment: STAIR Actions dataset can be downloaded from
http://actions.stair.cente