    Sample and Filter: Nonparametric Scene Parsing via Efficient Filtering

    Scene parsing has attracted considerable attention in computer vision. While parametric models have proven effective for this task, they cannot easily incorporate new training data. By contrast, nonparametric approaches, which bypass any learning phase and directly transfer the labels from the training data to the query images, can readily exploit new labeled samples as they become available. Unfortunately, because of the computational cost of their label transfer procedures, state-of-the-art nonparametric methods typically filter out most training images and keep only a few relevant ones to label the query. As such, these methods discard many images that still contain valuable information and generally obtain an unbalanced set of labeled samples. In this paper, we introduce a nonparametric approach to scene parsing that follows a sample-and-filter strategy. More specifically, we propose to sample labeled superpixels according to an image similarity score, which allows us to obtain a balanced set of samples. We then formulate label transfer as an efficient filtering procedure, which lets us exploit more labeled samples than existing techniques. Our experiments evidence the benefits of our approach over state-of-the-art nonparametric methods on two benchmark datasets. Comment: Please refer to the CVPR-2016 version of this manuscript.
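    To picture the two steps, here is a minimal NumPy sketch. It is an illustrative reading, not the paper's implementation: the function names and the per-class sampling scheme are assumptions, and the naive O(NM) Gaussian weighting stands in for the efficient high-dimensional filter the method actually relies on.

```python
import numpy as np

def sample_superpixels(labels, image_scores, image_ids, per_class=500, rng=None):
    """Draw a class-balanced set of labeled superpixels, weighting each
    superpixel by the similarity score of the image it comes from."""
    rng = rng or np.random.default_rng(0)
    picked = []
    for c in np.unique(labels):
        idx = np.where(labels == c)[0]            # superpixels of class c
        p = image_scores[image_ids[idx]]          # per-superpixel weight
        p = p / p.sum()
        n = min(per_class, idx.size)
        picked.append(rng.choice(idx, size=n, replace=False, p=p))
    return np.concatenate(picked)

def transfer_labels(query_feats, sample_feats, sample_labels, n_classes,
                    bandwidth=1.0):
    """Label transfer as filtering: each query superpixel accumulates
    Gaussian-weighted label votes from the sampled superpixels."""
    d2 = ((query_feats[:, None, :] - sample_feats[None, :, :]) ** 2).sum(-1)
    w = np.exp(-d2 / (2 * bandwidth ** 2))        # (n_query, n_samples) weights
    votes = w @ np.eye(n_classes)[sample_labels]  # filter the one-hot labels
    return votes.argmax(axis=1)
```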

    Additional Positive Enables Better Representation Learning for Medical Images

    This paper presents a new way to identify additional positive pairs for BYOL, a state-of-the-art (SOTA) self-supervised learning framework, to improve its representation learning ability. Unlike conventional BYOL, which relies on only one positive pair generated by two augmented views of the same image, we argue that information from different images with the same label can bring more diversity and variations to the target features, thus benefiting representation learning. To identify such pairs without any labels, we investigate TracIn, an instance-based and computationally efficient influence function, for BYOL training. Specifically, TracIn is a gradient-based method that reveals the impact of a training sample on a test sample in supervised learning. We extend it to the self-supervised learning setting and propose an efficient batch-wise per-sample gradient computation method to estimate pairwise TracIn scores that represent the similarity of samples in the mini-batch during training. For each image, we select the most similar sample from the other images as the additional positive and pull their features together with the BYOL loss. Experimental results on two public medical datasets (i.e., ISIC 2019 and ChestX-ray) demonstrate that the proposed method can improve the classification performance compared to other competitive baselines in both semi-supervised and transfer learning settings. Comment: 8 pages.
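    A hedged PyTorch sketch of the selection step follows, assuming flattened per-sample gradients are already available (e.g., computed batch-wise with torch.func); the function names and the combined objective are hypothetical stand-ins, not the paper's code.

```python
import torch
import torch.nn.functional as F

def pairwise_tracin(per_sample_grads):
    """TracIn similarity between batch samples: the dot product of their
    flattened per-sample gradients. Input shape (B, P), output (B, B)."""
    return per_sample_grads @ per_sample_grads.t()

def select_additional_positives(scores):
    """For each sample, pick the most TracIn-similar *other* sample."""
    scores = scores.clone()
    scores.fill_diagonal_(float("-inf"))      # never pick yourself
    return scores.argmax(dim=1)

def byol_loss(online_pred, target_proj):
    """Standard BYOL regression loss: 2 - 2 * cosine similarity."""
    p = F.normalize(online_pred, dim=-1)
    z = F.normalize(target_proj, dim=-1)
    return 2.0 - 2.0 * (p * z).sum(dim=-1)

def step_loss(online_pred, target_proj, per_sample_grads):
    """Hypothetical combined objective: the usual augmented-view pair plus
    the selected in-batch additional positive (target_proj comes from the
    momentum encoder, so no gradients flow through it)."""
    extra = select_additional_positives(pairwise_tracin(per_sample_grads))
    return (byol_loss(online_pred, target_proj)
            + byol_loss(online_pred, target_proj[extra])).mean()
```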

    Label-efficient Contrastive Learning-based model for nuclei detection and classification in 3D Cardiovascular Immunofluorescent Images

    Recently, deep learning-based methods have achieved promising performance in nuclei detection and classification applications. However, training deep learning-based methods requires a large amount of pixel-wise annotated data, which is time-consuming and labor-intensive, especially in 3D images. An alternative approach is to adopt weak-annotation methods, such as labeling each nucleus with a point, but this method does not extend from 2D histopathology images (for which it was originally developed) to 3D immunofluorescent images. The reason is that 3D images contain multiple channels (z-axis) for nuclei and different markers separately, which makes training with point annotations difficult. To address this challenge, we propose the Label-efficient Contrastive learning-based (LECL) model to detect and classify various types of nuclei in 3D immunofluorescent images. Previous methods use Maximum Intensity Projection (MIP) to convert immunofluorescent images with multiple slices to 2D images, which can cause signals from different z-stacks to falsely appear associated with each other. To overcome this, we devised an Extended Maximum Intensity Projection (EMIP) approach that addresses the issues caused by MIP. Furthermore, we applied a Supervised Contrastive Learning (SCL) approach for weakly supervised settings. We conducted experiments on cardiovascular datasets and found that our proposed framework is effective and efficient in detecting and classifying various types of nuclei in 3D immunofluorescent images. Comment: 11 pages, 5 figures, MICCAI Workshop Conference 202
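    To illustrate the projection problem, here is a small NumPy sketch contrasting plain MIP with a restricted, region-wise projection. This is only one plausible reading of EMIP, under the assumption that each region should be projected over the z-range where its signal actually lives; the paper's exact algorithm may differ, and all names here are hypothetical.

```python
import numpy as np

def mip(volume):
    """Standard Maximum Intensity Projection along z: (Z, H, W) -> (H, W).
    Structures from distant z-stacks can collapse onto the same 2D spot."""
    return volume.max(axis=0)

def extended_mip(volume, regions):
    """Restricted projection: project each 2D region only over its own
    z-range, so signals from unrelated z-stacks are not merged.

    regions: list of (mask, (z_lo, z_hi)) pairs, where mask is an (H, W)
    boolean array marking the region's footprint."""
    out = np.zeros(volume.shape[1:], dtype=volume.dtype)
    for mask, (z_lo, z_hi) in regions:
        out[mask] = volume[z_lo:z_hi].max(axis=0)[mask]
    return out
```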

    Deep Learning Models to Characterize Smooth Muscle Fibers in Hematoxylin and Eosin Stained Histopathological Images of the Urinary Bladder

    Muscularis propria (MP) and muscularis mucosa (MM), two types of smooth muscle fibers in the urinary bladder, are major benchmarks in staging bladder cancer to distinguish between muscle-invasive (MP invasion) and non-muscle-invasive (MM invasion) diseases. While patients with non-muscle-invasive tumors can be treated conservatively with transurethral resection (TUR) alone, more aggressive treatment options, such as removal of the entire bladder, known as radical cystectomy (RC), which may severely degrade the patient's quality of life, are often required in those with muscle-invasive tumors. Hence, given two types of image datasets, hematoxylin & eosin-stained histopathological images from RC and TUR specimens, we propose the first deep learning-based method for efficient characterization of MP. The proposed method is intended to aid pathologists as a decision support system by facilitating accurate staging of bladder cancer. In this work, we aim to semantically segment the TUR images into MP and non-MP regions using two different approaches: patch-to-label and pixel-to-label. We evaluate four state-of-the-art CNN-based classification models (VGG16, ResNet18, SqueezeNet, and MobileNetV2) and four semantic segmentation models (U-Net, MA-Net, DeepLabv3+, and FPN) and compare their performance metrics at the pixel level. The SqueezeNet model (mean Jaccard Index: 95.44%, mean Dice coefficient: 97.66%) in the patch-to-label approach and the MA-Net model (mean Jaccard Index: 96.64%, mean Dice coefficient: 98.29%) in the pixel-to-label approach perform best among the tested models. Although the pixel-to-label approach is marginally better than the patch-to-label approach based on the evaluation metrics, the latter is computationally efficient, using the fewest trainable parameters.
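    The patch-to-label idea, and the pixel-level metrics used to compare the two approaches, can be sketched in a few lines of NumPy. The tiling function and the classify_patch callable are hypothetical stand-ins (e.g., a wrapped classifier returning 0 for non-MP and 1 for MP); only the Jaccard and Dice definitions follow standard practice.

```python
import numpy as np

def patch_to_label_mask(image, classify_patch, patch=256):
    """Patch-to-label inference: classify each tile as MP (1) or non-MP (0)
    and broadcast the tile's label to all of its pixels."""
    H, W = image.shape[:2]
    mask = np.zeros((H, W), dtype=np.uint8)
    for y in range(0, H, patch):
        for x in range(0, W, patch):
            tile = image[y:y + patch, x:x + patch]
            mask[y:y + patch, x:x + patch] = classify_patch(tile)
    return mask

def jaccard_and_dice(pred, gt):
    """Pixel-level Jaccard index and Dice coefficient for binary masks."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    inter = np.logical_and(pred, gt).sum()
    jaccard = inter / np.logical_or(pred, gt).sum()
    dice = 2 * inter / (pred.sum() + gt.sum())
    return jaccard, dice
```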

    Quantifying Facial Age by Posterior of Age Comparisons

    We introduce a novel approach for annotating a large quantity of in-the-wild facial images with high-quality posterior age distributions as labels. Each posterior provides a probability distribution of estimated ages for a face. Our approach is motivated by the observation that it is easier to determine who is the older of two people than to estimate a person's actual age. Given a reference database with samples of known ages and a dataset to label, we can transfer reliable annotations from the former to the latter via human-in-the-loop comparisons. We show an effective way to transform such comparisons into posteriors via fully-connected and SoftMax layers, so as to permit end-to-end training in a deep network. Thanks to this efficient and effective annotation approach, we collect a new large-scale facial age dataset, dubbed `MegaAge', which consists of 41,941 images. Data can be downloaded from our project page mmlab.ie.cuhk.edu.hk/projects/MegaAge and github.com/zyx2012/Age_estimation_BMVC2017. With the dataset, we train a network that jointly performs ordinal hyperplane classification and posterior distribution learning. Our approach achieves state-of-the-art results on popular benchmarks such as MORPH2, Adience, and the newly proposed MegaAge. Comment: To appear at BMVC 2017 (oral), revised version.
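    The intuition that pairwise "older/younger" comparisons pin down an age distribution can be shown with a toy probabilistic sketch. Note that the paper learns this mapping end-to-end with fully-connected and SoftMax layers; the explicit Bayesian update below is a simplified analogue, and the function name, vote encoding, and noise model are all assumptions.

```python
import numpy as np

def posterior_from_comparisons(ref_ages, votes, ages=np.arange(0, 101),
                               noise=0.1):
    """Toy posterior over ages from 'older than reference?' votes.

    Each vote compares the query face against a reference of known age r
    and is assumed correct with probability 1 - noise (an assumption)."""
    post = np.ones(ages.shape, dtype=float)           # uniform prior
    for r, older in zip(ref_ages, votes):
        lik = np.where(ages > r, 1.0 - noise, noise)  # P(vote 'older' | age)
        post *= lik if older else 1.0 - lik
    return post / post.sum()

# e.g. 'older than 20': yes, 'older than 35': yes, 'older than 50': no
# -> posterior_from_comparisons([20, 35, 50], [1, 1, 0]) concentrates its
#    mass on ages 36-50.
```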