1,399 research outputs found

    Learning to Segment Every Thing

    Full text link
    Most methods for object instance segmentation require all training examples to be labeled with segmentation masks. This requirement makes it expensive to annotate new categories and has restricted instance segmentation models to ~100 well-annotated classes. The goal of this paper is to propose a new partially supervised training paradigm, together with a novel weight transfer function, that enables training instance segmentation models on a large set of categories all of which have box annotations, but only a small fraction of which have mask annotations. These contributions allow us to train Mask R-CNN to detect and segment 3000 visual concepts using box annotations from the Visual Genome dataset and mask annotations from the 80 classes in the COCO dataset. We evaluate our approach in a controlled study on the COCO dataset. This work is a first step towards instance segmentation models that have broad comprehension of the visual world

    Learning to segment with image-level supervision

    Full text link
    Deep convolutional networks have achieved the state-of-the-art for semantic image segmentation tasks. However, training these networks requires access to densely labeled images, which are known to be very expensive to obtain. On the other hand, the web provides an almost unlimited source of images annotated at the image level. How can one utilize this much larger weakly annotated set for tasks that require dense labeling? Prior work often relied on localization cues, such as saliency maps, objectness priors, bounding boxes etc., to address this challenging problem. In this paper, we propose a model that generates auxiliary labels for each image, while simultaneously forcing the output of the CNN to satisfy the mean-field constraints imposed by a conditional random field. We show that one can enforce the CRF constraints by forcing the distribution at each pixel to be close to the distribution of its neighbors. This is in stark contrast with methods that compute a recursive expansion of the mean-field distribution using a recurrent architecture and train the resultant distribution. Instead, the proposed model adds an extra loss term to the output of the CNN, and hence, is faster than recursive implementations. We achieve the state-of-the-art for weakly supervised semantic image segmentation on VOC 2012 dataset, assuming no manually labeled pixel level information is available. Furthermore, the incorporation of conditional random fields in CNN incurs little extra time during training.Comment: Published in WACV 201

    Learning to Segment Breast Biopsy Whole Slide Images

    Full text link
    We trained and applied an encoder-decoder model to semantically segment breast biopsy images into biologically meaningful tissue labels. Since conventional encoder-decoder networks cannot be applied directly on large biopsy images and the different sized structures in biopsies present novel challenges, we propose four modifications: (1) an input-aware encoding block to compensate for information loss, (2) a new dense connection pattern between encoder and decoder, (3) dense and sparse decoders to combine multi-level features, (4) a multi-resolution network that fuses the results of encoder-decoders run on different resolutions. Our model outperforms a feature-based approach and conventional encoder-decoders from the literature. We use semantic segmentations produced with our model in an automated diagnosis task and obtain higher accuracies than a baseline approach that employs an SVM for feature-based segmentation, both using the same segmentation-based diagnostic features.Comment: Added more WSI images in appendi
    • …
    corecore