680 research outputs found

    Optimization of Robust Loss Functions for Weakly-Labeled Image Taxonomies

    Get PDF

    Optimization of robust loss functions for weakly-labeled image taxonomies: An ImageNet case study

    Get PDF
    Trabajo presentado a la 8th International Conference EMMCVPR celebrada en St. Petersburg del 25 al 27 de julio de 2011.The recently proposed ImageNet dataset consists of several million images, each annotated with a single object category. However, these annotations may be imperfect, in the sense that many images contain multiple objects belonging to the label vocabulary. In other words, we have a multi-label problem but the annotations include only a single label (and not necessarily the most prominent). Such a setting motivates the use of a robust evaluation measure, which allows for a limited number of labels to be predicted and, as long as one of the predicted labels is correct, the overall prediction should be considered correct. This is indeed the type of evaluation measure used to assess algorithm performance in a recent competition on ImageNet data. Optimizing such types of performance measures presents several hurdles even with existing structured output learning methods. Indeed, many of the current state-of-the-art methods optimize the prediction of only a single output label, ignoring this `structureÂż altogether. In this paper, we show how to directly optimize continuous surrogates of such performance measures using structured output learning techniques with latent variables. We use the output of existing binary classifiers as input features in a new learning stage which optimizes the structured loss corresponding to the robust performance measure. We present empirical evidence that this allows us to `boostÂż the performance of existing binary classifiers which are the state-of-the-art for the task of object classification in ImageNet.Part of this work was carried out when both AR and TC were at INRIA Grenoble, Rhone-Alpes. NICTA is funded by the Australian Government as represented by the Department of Broadband, Communications and the Digital Economy and the Australian Research Council through the ICT Centre of Excellence program. This work was partially funded by the QUAERO project supported by OSEO, French State agency for innovation and by MICINN under project MIPRCV Consolider Ingenio CSD2007- 00018.Peer Reviewe

    Weakly supervised training of universal visual concepts for multi-domain semantic segmentation

    Full text link
    Deep supervised models have an unprecedented capacity to absorb large quantities of training data. Hence, training on multiple datasets becomes a method of choice towards strong generalization in usual scenes and graceful performance degradation in edge cases. Unfortunately, different datasets often have incompatible labels. For instance, the Cityscapes road class subsumes all driving surfaces, while Vistas defines separate classes for road markings, manholes etc. Furthermore, many datasets have overlapping labels. For instance, pickups are labeled as trucks in VIPER, cars in Vistas, and vans in ADE20k. We address this challenge by considering labels as unions of universal visual concepts. This allows seamless and principled learning on multi-domain dataset collections without requiring any relabeling effort. Our method achieves competitive within-dataset and cross-dataset generalization, as well as ability to learn visual concepts which are not separately labeled in any of the training datasets. Experiments reveal competitive or state-of-the-art performance on two multi-domain dataset collections and on the WildDash 2 benchmark.Comment: 27 pages, 16 figures, 10 table

    Large-scale image classification using ensembles of nested dichotomies

    Get PDF
    Many techniques to reduce the cost at test time in large-scale problems involve a hierarchical organization of classifiers, but are either too expensive to learn or degrade the classification performance. Conversely, in this work we show that using ensembles of randomized hierarchical decompositions of the original problem can both improve the accuracy and reduce the computational complexity at test time. The proposed method is evaluated in the ImageNet Large Scale Visual Recognition Challenge’10, with promising results.Peer ReviewedPostprint (author’s final draft
    • …
    corecore