
    Bag-Level Aggregation for Multiple Instance Active Learning in Instance Classification Problems

    A growing number of applications, e.g., video surveillance and medical image analysis, require training recognition systems from large amounts of weakly annotated data, while some targeted interactions with a domain expert are allowed to improve the training process. In such cases, active learning (AL) can reduce the labeling cost of training a classifier by querying the expert for the labels of the most informative instances. This paper focuses on AL methods for instance classification problems in multiple instance learning (MIL), where data are arranged into weakly labeled sets called bags. Most AL methods focus on single instance learning problems and are not suitable for MIL problems because they cannot account for the bag structure of the data. In this paper, new methods for bag-level aggregation of instance informativeness are proposed for multiple instance active learning (MIAL). The aggregated informativeness method identifies the most informative instances based on classifier uncertainty and queries the bags incorporating the most information. The other proposed method, called cluster-based aggregative sampling, clusters data hierarchically in the instance space; the informativeness of instances is assessed by considering bag labels, inferred instance labels, and the proportion of labels that remain to be discovered in clusters. Both proposed methods significantly outperform reference methods in extensive experiments on benchmark data from several application domains. Results indicate that using an appropriate strategy to address MIAL problems yields a significant reduction in the number of queries needed to reach the same level of performance as single instance AL methods.
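    To make the bag-level aggregation idea concrete, the minimal sketch below scores each unlabeled bag by summing the entropy of its instances' predicted class probabilities and queries the highest-scoring bag. The names are assumptions for this illustration (`query_most_informative_bag`, a scikit-learn-style `classifier` exposing `predict_proba`), and the sum-aggregation rule is only one plausible choice; the paper's actual aggregation strategies may differ.

```python
import numpy as np

def instance_entropy(probs):
    """Shannon entropy of per-instance class probabilities (higher = more uncertain)."""
    p = np.clip(probs, 1e-12, 1.0)
    return -(p * np.log(p)).sum(axis=1)

def query_most_informative_bag(unlabeled_bags, classifier):
    """Score each unlabeled bag by the aggregated uncertainty of its instances
    and return the index of the bag to send to the domain expert for labeling."""
    scores = []
    for bag in unlabeled_bags:                        # bag: (n_instances, n_features) array
        probs = classifier.predict_proba(bag)         # instance-level class probabilities
        scores.append(instance_entropy(probs).sum())  # sum-aggregation chosen for illustration
    return int(np.argmax(scores))
```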

    Socializing the Semantic Gap: A Comparative Survey on Image Tag Assignment, Refinement and Retrieval

    Where previous reviews on content-based image retrieval emphasize what can be seen in an image to bridge the semantic gap, this survey considers what people tag about an image. A comprehensive treatise of three closely linked problems is presented: image tag assignment, refinement, and tag-based image retrieval. While existing works vary in their targeted tasks and methodology, they all rely on the key functionality of tag relevance, i.e., estimating the relevance of a specific tag with respect to the visual content of a given image and its social context. By analyzing what information a specific method exploits to construct its tag relevance function and how that information is exploited, this paper introduces a taxonomy to structure the growing literature, understand the ingredients of the main works, clarify their connections and differences, and recognize their merits and limitations. For a head-to-head comparison among state-of-the-art methods, a new experimental protocol is presented, with training sets containing 10k, 100k, and 1M images and an evaluation on three test sets contributed by various research groups. Eleven representative works are implemented and evaluated. Putting all this together, the survey aims to provide an overview of the past and foster progress in the near future. (To appear in ACM Computing Surveys.)
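    As a rough illustration of what a tag relevance function can look like, the sketch below uses a neighbor-voting style estimate: count how often the tag occurs among an image's visually nearest neighbors and subtract the count expected from the tag's overall frequency. The function and variable names (`tag_relevance`, `corpus_tags`, `k`) are assumptions for this sketch, and the surveyed methods construct tag relevance in many different ways.

```python
import numpy as np

def tag_relevance(query_feature, corpus_features, corpus_tags, tag, k=50):
    """Neighbor-voting estimate of how relevant `tag` is to the query image:
    occurrences of the tag among the k visually nearest images, minus the count
    expected from the tag's overall frequency, so common tags are not trivially favored."""
    dists = np.linalg.norm(corpus_features - query_feature, axis=1)  # visual distances
    nearest = np.argsort(dists)[:k]                                  # k nearest neighbors
    votes = sum(1 for i in nearest if tag in corpus_tags[i])
    prior = k * sum(1 for tags in corpus_tags if tag in tags) / len(corpus_tags)
    return votes - prior
```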

    A Training Framework of Robotic Operation and Image Analysis for Decision-Making in Bridge Inspection and Preservation

    This project aims to create a framework for training engineers and policy makers in robotic operation and image analysis for the inspection and preservation of transportation infrastructure. Specifically, it develops a method for collecting camera-based bridge inspection data and algorithms for data processing and pattern recognition, and it creates tools that assist users in visually analyzing the processed image data and recognized patterns for inspection and preservation decision-making. The project first developed a Siamese Neural Network to support bridge engineers in analyzing big video data. The network was initially trained by one-shot learning and is fine-tuned iteratively with a human in the loop. Bridge engineers define a region of interest, and the algorithm then retrieves all related regions in the video, allowing the engineers to inspect the bridge without exhaustively checking every frame. The neural network was evaluated on three bridge inspection videos with promising performance. The project then developed an assistive intelligence system to help inspectors efficiently and accurately detect and segment multiclass bridge elements from inspection videos. A Mask Region-based Convolutional Neural Network was transferred to the studied problem with a small initial training dataset labeled by the inspector. Temporal coherence analysis was then used to recover false negative detections of the transferred network. Finally, self-training with guidance from experienced inspectors was used to iteratively refine the network. Results from a case study demonstrate that the proposed method needs only a small amount of time and guidance from experienced inspectors to build the assistive intelligence system with excellent performance.
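    A minimal sketch of the region-retrieval step described above: embed the inspector-defined region of interest and each candidate region with the (assumed) embedding branch of a Siamese network, then keep the regions whose similarity exceeds a threshold. The `embed` callable and the threshold value are placeholders for this sketch, not the project's actual implementation.

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine similarity between two embedding vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

def retrieve_similar_regions(roi_embedding, candidate_regions, embed, threshold=0.8):
    """Return (region, score) pairs whose embeddings are close to the inspector-defined
    region of interest, so only matching regions need to be reviewed."""
    matches = []
    for region in candidate_regions:           # region: cropped image patch from a video frame
        score = cosine_similarity(roi_embedding, embed(region))
        if score >= threshold:                 # threshold is an illustrative value
            matches.append((region, score))
    return matches
```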