2,671 research outputs found

    Modeling Multiple Annotator Expertise in the Semi-Supervised Learning Scenario

    Full text link
    Learning algorithms normally assume that there is at most one annotation or label per data point. However, in some scenarios, such as medical diagnosis and on-line collaboration,multiple annotations may be available. In either case, obtaining labels for data points can be expensive and time-consuming (in some circumstances ground-truth may not exist). Semi-supervised learning approaches have shown that utilizing the unlabeled data is often beneficial in these cases. This paper presents a probabilistic semi-supervised model and algorithm that allows for learning from both unlabeled and labeled data in the presence of multiple annotators. We assume that it is known what annotator labeled which data points. The proposed approach produces annotator models that allow us to provide (1) estimates of the true label and (2) annotator variable expertise for both labeled and unlabeled data. We provide numerical comparisons under various scenarios and with respect to standard semi-supervised learning. Experiments showed that the presented approach provides clear advantages over multi-annotator methods that do not use the unlabeled data and over methods that do not use multi-labeler information.Comment: Appears in Proceedings of the Twenty-Sixth Conference on Uncertainty in Artificial Intelligence (UAI2010

    Medical image retrieval and automatic annotation: VPA-SABANCI at ImageCLEF 2009

    Get PDF
    Advances in the medical imaging technology has lead to an exponential growth in the number of digital images that needs to be acquired, analyzed, classified, stored and retrieved in medical centers. As a result, medical image classification and retrieval has recently gained high interest in the scientific community. Despite several attempts, such as the yearly-held ImageCLEF Medical Image Annotation Competition, the proposed solutions are still far from being su±ciently accurate for real-life implementations. In this paper we summarize the technical details of our experiments for the ImageCLEF 2009 medical image annotation task. We use a direct and two hierarchical classification schemes that employ support vector machines and local binary patterns, which are recently developed low-cost texture descriptors. The direct scheme employs a single SVM to automatically annotate X-ray images. The two proposed hierarchi-cal schemes divide the classification task into sub-problems. The first hierarchical scheme exploits ensemble SVMs trained on IRMA sub-codes. The second learns from subgroups of data defined by frequency of classes. Our experiments show that hier-archical annotation of images by training individual SVMs over each IRMA sub-code dominates its rivals in annotation accuracy with increased process time relative to the direct scheme

    Active Mining of Parallel Video Streams

    Full text link
    The practicality of a video surveillance system is adversely limited by the amount of queries that can be placed on human resources and their vigilance in response. To transcend this limitation, a major effort under way is to include software that (fully or at least semi) automatically mines video footage, reducing the burden imposed to the system. Herein, we propose a semi-supervised incremental learning framework for evolving visual streams in order to develop a robust and flexible track classification system. Our proposed method learns from consecutive batches by updating an ensemble in each time. It tries to strike a balance between performance of the system and amount of data which needs to be labelled. As no restriction is considered, the system can address many practical problems in an evolving multi-camera scenario, such as concept drift, class evolution and various length of video streams which have not been addressed before. Experiments were performed on synthetic as well as real-world visual data in non-stationary environments, showing high accuracy with fairly little human collaboration

    Refining Image Categorization by Exploiting Web Images and General Corpus

    Full text link
    Studies show that refining real-world categories into semantic subcategories contributes to better image modeling and classification. Previous image sub-categorization work relying on labeled images and WordNet's hierarchy is not only labor-intensive, but also restricted to classify images into NOUN subcategories. To tackle these problems, in this work, we exploit general corpus information to automatically select and subsequently classify web images into semantic rich (sub-)categories. The following two major challenges are well studied: 1) noise in the labels of subcategories derived from the general corpus; 2) noise in the labels of images retrieved from the web. Specifically, we first obtain the semantic refinement subcategories from the text perspective and remove the noise by the relevance-based approach. To suppress the search error induced noisy images, we then formulate image selection and classifier learning as a multi-class multi-instance learning problem and propose to solve the employed problem by the cutting-plane algorithm. The experiments show significant performance gains by using the generated data of our way on both image categorization and sub-categorization tasks. The proposed approach also consistently outperforms existing weakly supervised and web-supervised approaches

    Novelty Detection Under Multi-Instance Multi-Label Framework

    Full text link
    Novelty detection plays an important role in machine learning and signal processing. This paper studies novelty detection in a new setting where the data object is represented as a bag of instances and associated with multiple class labels, referred to as multi-instance multi-label (MIML) learning. Contrary to the common assumption in MIML that each instance in a bag belongs to one of the known classes, in novelty detection, we focus on the scenario where bags may contain novel-class instances. The goal is to determine, for any given instance in a new bag, whether it belongs to a known class or a novel class. Detecting novelty in the MIML setting captures many real-world phenomena and has many potential applications. For example, in a collection of tagged images, the tag may only cover a subset of objects existing in the images. Discovering an object whose class has not been previously tagged can be useful for the purpose of soliciting a label for the new object class. To address this novel problem, we present a discriminative framework for detecting new class instances. Experiments demonstrate the effectiveness of our proposed method, and reveal that the presence of unlabeled novel instances in training bags is helpful to the detection of such instances in testing stage

    Weakly supervised segment annotation via expectation kernel density estimation

    Full text link
    Since the labelling for the positive images/videos is ambiguous in weakly supervised segment annotation, negative mining based methods that only use the intra-class information emerge. In these methods, negative instances are utilized to penalize unknown instances to rank their likelihood of being an object, which can be considered as a voting in terms of similarity. However, these methods 1) ignore the information contained in positive bags, 2) only rank the likelihood but cannot generate an explicit decision function. In this paper, we propose a voting scheme involving not only the definite negative instances but also the ambiguous positive instances to make use of the extra useful information in the weakly labelled positive bags. In the scheme, each instance votes for its label with a magnitude arising from the similarity, and the ambiguous positive instances are assigned soft labels that are iteratively updated during the voting. It overcomes the limitations of voting using only the negative bags. We also propose an expectation kernel density estimation (eKDE) algorithm to gain further insight into the voting mechanism. Experimental results demonstrate the superiority of our scheme beyond the baselines.Comment: 9 pages, 2 figure

    Clue: Cross-modal Coherence Modeling for Caption Generation

    Full text link
    We use coherence relations inspired by computational models of discourse to study the information needs and goals of image captioning. Using an annotation protocol specifically devised for capturing image--caption coherence relations, we annotate 10,000 instances from publicly-available image--caption pairs. We introduce a new task for learning inferences in imagery and text, coherence relation prediction, and show that these coherence annotations can be exploited to learn relation classifiers as an intermediary step, and also train coherence-aware, controllable image captioning models. The results show a dramatic improvement in the consistency and quality of the generated captions with respect to information needs specified via coherence relations.Comment: Accepted as a long paper to ACL 202

    Multi-Label Classification of Patient Notes a Case Study on ICD Code Assignment

    Full text link
    In the context of the Electronic Health Record, automated diagnosis coding of patient notes is a useful task, but a challenging one due to the large number of codes and the length of patient notes. We investigate four models for assigning multiple ICD codes to discharge summaries taken from both MIMIC II and III. We present Hierarchical Attention-GRU (HA-GRU), a hierarchical approach to tag a document by identifying the sentences relevant for each label. HA-GRU achieves state-of-the art results. Furthermore, the learned sentence-level attention layer highlights the model decision process, allows easier error analysis, and suggests future directions for improvement

    Diverse Yet Efficient Retrieval using Hash Functions

    Full text link
    Typical retrieval systems have three requirements: a) Accurate retrieval i.e., the method should have high precision, b) Diverse retrieval, i.e., the obtained set of points should be diverse, c) Retrieval time should be small. However, most of the existing methods address only one or two of the above mentioned requirements. In this work, we present a method based on randomized locality sensitive hashing which tries to address all of the above requirements simultaneously. While earlier hashing approaches considered approximate retrieval to be acceptable only for the sake of efficiency, we argue that one can further exploit approximate retrieval to provide impressive trade-offs between accuracy and diversity. We extend our method to the problem of multi-label prediction, where the goal is to output a diverse and accurate set of labels for a given document in real-time. Moreover, we introduce a new notion to simultaneously evaluate a method's performance for both the precision and diversity measures. Finally, we present empirical results on several different retrieval tasks and show that our method retrieves diverse and accurate images/labels while ensuring 100x100x-speed-up over the existing diverse retrieval approaches.Comment: 10 page

    STAIR Actions: A Video Dataset of Everyday Home Actions

    Full text link
    A new large-scale video dataset for human action recognition, called STAIR Actions is introduced. STAIR Actions contains 100 categories of action labels representing fine-grained everyday home actions so that it can be applied to research in various home tasks such as nursing, caring, and security. In STAIR Actions, each video has a single action label. Moreover, for each action category, there are around 1,000 videos that were obtained from YouTube or produced by crowdsource workers. The duration of each video is mostly five to six seconds. The total number of videos is 102,462. We explain how we constructed STAIR Actions and show the characteristics of STAIR Actions compared to existing datasets for human action recognition. Experiments with three major models for action recognition show that STAIR Actions can train large models and achieve good performance. STAIR Actions can be downloaded from http://actions.stair.centerComment: STAIR Actions dataset can be downloaded from http://actions.stair.cente
    • …
    corecore