54,712 research outputs found

    Improved Training for Self-Training by Confidence Assessments

    Full text link
    It is well known that for some tasks, labeled data sets may be hard to gather. Therefore, we wished to tackle here the problem of having insufficient training data. We examined learning methods from unlabeled data after an initial training on a limited labeled data set. The suggested approach can be used as an online learning method on the unlabeled test set. In the general classification task, whenever we predict a label with high enough confidence, we treat it as a true label and train the data accordingly. For the semantic segmentation task, a classic example for an expensive data labeling process, we do so pixel-wise. Our suggested approaches were applied on the MNIST data-set as a proof of concept for a vision classification task and on the ADE20K data-set in order to tackle the semi-supervised semantic segmentation problem

    Explainable Semantic Medical Image Segmentation with Style

    Full text link
    Semantic medical image segmentation using deep learning has recently achieved high accuracy, making it appealing to clinical problems such as radiation therapy. However, the lack of high-quality semantically labelled data remains a challenge leading to model brittleness to small shifts to input data. Most works require extra data for semi-supervised learning and lack the interpretability of the boundaries of the training data distribution during training, which is essential for model deployment in clinical practice. We propose a fully supervised generative framework that can achieve generalisable segmentation with only limited labelled data by simultaneously constructing an explorable manifold during training. The proposed approach creates medical image style paired with a segmentation task driven discriminator incorporating end-to-end adversarial training. The discriminator is generalised to small domain shifts as much as permissible by the training data, and the generator automatically diversifies the training samples using a manifold of input features learnt during segmentation. All the while, the discriminator guides the manifold learning by supervising the semantic content and fine-grained features separately during the image diversification. After training, visualisation of the learnt manifold from the generator is available to interpret the model limits. Experiments on a fully semantic, publicly available pelvis dataset demonstrated that our method is more generalisable to shifts than other state-of-the-art methods while being more explainable using an explorable manifold

    Semantic RGB-D Image Synthesis

    Full text link
    Collecting diverse sets of training images for RGB-D semantic image segmentation is not always possible. In particular, when robots need to operate in privacy-sensitive areas like homes, the collection is often limited to a small set of locations. As a consequence, the annotated images lack diversity in appearance and approaches for RGB-D semantic image segmentation tend to overfit the training data. In this paper, we thus introduce semantic RGB-D image synthesis to address this problem. It requires synthesising a realistic-looking RGB-D image for a given semantic label map. Current approaches, however, are uni-modal and cannot cope with multi-modal data. Indeed, we show that extending uni-modal approaches to multi-modal data does not perform well. In this paper, we therefore propose a generator for multi-modal data that separates modal-independent information of the semantic layout from the modal-dependent information that is needed to generate an RGB and a depth image, respectively. Furthermore, we propose a discriminator that ensures semantic consistency between the label maps and the generated images and perceptual similarity between the real and generated images. Our comprehensive experiments demonstrate that the proposed method outperforms previous uni-modal methods by a large margin and that the accuracy of an approach for RGB-D semantic segmentation can be significantly improved by mixing real and generated images during training

    Domain Adaptive Transfer Attack (DATA)-based Segmentation Networks for Building Extraction from Aerial Images

    Full text link
    Semantic segmentation models based on convolutional neural networks (CNNs) have gained much attention in relation to remote sensing and have achieved remarkable performance for the extraction of buildings from high-resolution aerial images. However, the issue of limited generalization for unseen images remains. When there is a domain gap between the training and test datasets, CNN-based segmentation models trained by a training dataset fail to segment buildings for the test dataset. In this paper, we propose segmentation networks based on a domain adaptive transfer attack (DATA) scheme for building extraction from aerial images. The proposed system combines the domain transfer and adversarial attack concepts. Based on the DATA scheme, the distribution of the input images can be shifted to that of the target images while turning images into adversarial examples against a target network. Defending adversarial examples adapted to the target domain can overcome the performance degradation due to the domain gap and increase the robustness of the segmentation model. Cross-dataset experiments and the ablation study are conducted for the three different datasets: the Inria aerial image labeling dataset, the Massachusetts building dataset, and the WHU East Asia dataset. Compared to the performance of the segmentation network without the DATA scheme, the proposed method shows improvements in the overall IoU. Moreover, it is verified that the proposed method outperforms even when compared to feature adaptation (FA) and output space adaptation (OSA).Comment: 11pages, 12 figure
    corecore