4 research outputs found

    Keypoint Based Weakly Supervised Human Parsing

    Full text link
    Fully convolutional networks (FCN) have achieved great success in human parsing in recent years. In conventional human parsing tasks, pixel-level labeling is required for guiding the training, which usually involves enormous human labeling efforts. To ease the labeling efforts, we propose a novel weakly supervised human parsing method which only requires simple object keypoint annotations for learning. We develop an iterative learning method to generate pseudo part segmentation masks from keypoint labels. With these pseudo masks, we train an FCN network to output pixel-level human parsing predictions. Furthermore, we develop a correlation network to perform joint prediction of part and object segmentation masks and improve the segmentation performance. The experiment results show that our weakly supervised method is able to achieve very competitive human parsing results. Despite our method only uses simple keypoint annotations for learning, we are able to achieve comparable performance with fully supervised methods which use the expensive pixel-level annotations

    Decoupled Spatial Neural Attention for Weakly Supervised Semantic Segmentation

    Full text link
    Weakly supervised semantic segmentation receives much research attention since it alleviates the need to obtain a large amount of dense pixel-wise ground-truth annotations for the training images. Compared with other forms of weak supervision, image labels are quite efficient to obtain. In our work, we focus on the weakly supervised semantic segmentation with image label annotations. Recent progress for this task has been largely dependent on the quality of generated pseudo-annotations. In this work, inspired by spatial neural-attention for image captioning, we propose a decoupled spatial neural attention network for generating pseudo-annotations. Our decoupled attention structure could simultaneously identify the object regions and localize the discriminative parts which generates high-quality pseudo-annotations in one forward path. The generated pseudo-annotations lead to the segmentation results which achieve the state-of-the-art in weakly-supervised semantic segmentation

    Automatic Image Labelling at Pixel Level

    Full text link
    The performance of deep networks for semantic image segmentation largely depends on the availability of large-scale training images which are labelled at the pixel level. Typically, such pixel-level image labellings are obtained manually by a labour-intensive process. To alleviate the burden of manual image labelling, we propose an interesting learning approach to generate pixel-level image labellings automatically. A Guided Filter Network (GFN) is first developed to learn the segmentation knowledge from a source domain, and such GFN then transfers such segmentation knowledge to generate coarse object masks in the target domain. Such coarse object masks are treated as pseudo labels and they are further integrated to optimize/refine the GFN iteratively in the target domain. Our experiments on six image sets have demonstrated that our proposed approach can generate fine-grained object masks (i.e., pixel-level object labellings), whose quality is very comparable to the manually-labelled ones. Our proposed approach can also achieve better performance on semantic image segmentation than most existing weakly-supervised approaches

    Show, Match and Segment: Joint Weakly Supervised Learning of Semantic Matching and Object Co-segmentation

    Full text link
    We present an approach for jointly matching and segmenting object instances of the same category within a collection of images. In contrast to existing algorithms that tackle the tasks of semantic matching and object co-segmentation in isolation, our method exploits the complementary nature of the two tasks. The key insights of our method are two-fold. First, the estimated dense correspondence fields from semantic matching provide supervision for object co-segmentation by enforcing consistency between the predicted masks from a pair of images. Second, the predicted object masks from object co-segmentation in turn allow us to reduce the adverse effects due to background clutters for improving semantic matching. Our model is end-to-end trainable and does not require supervision from manually annotated correspondences and object masks. We validate the efficacy of our approach on five benchmark datasets: TSS, Internet, PF-PASCAL, PF-WILLOW, and SPair-71k, and show that our algorithm performs favorably against the state-of-the-art methods on both semantic matching and object co-segmentation tasks.Comment: PAMI 2020. Project: https://yunchunchen.github.io/MaCoSNet-web/ Code: https://github.com/YunChunChen/MaCoSNet-pytorc
    corecore