Sample and Filter: Nonparametric Scene Parsing via Efficient Filtering
Scene parsing has attracted a lot of attention in computer vision. While
parametric models have proven effective for this task, they cannot easily
incorporate new training data. By contrast, nonparametric approaches, which
bypass any learning phase and directly transfer the labels from the training
data to the query images, can readily exploit new labeled samples as they
become available. Unfortunately, because of the computational cost of their
label transfer procedures, state-of-the-art nonparametric methods typically
filter out most training images to only keep a few relevant ones to label the
query. As such, these methods throw away many images that still contain
valuable information and generally obtain an unbalanced set of labeled samples.
In this paper, we introduce a nonparametric approach to scene parsing that
follows a sample-and-filter strategy. More specifically, we propose to sample
labeled superpixels according to an image similarity score, which allows us to
obtain a balanced set of samples. We then formulate label transfer as an
efficient filtering procedure, which lets us exploit more labeled samples than
existing techniques. Our experiments evidence the benefits of our approach over
state-of-the-art nonparametric methods on two benchmark datasets.
Comment: Please refer to the CVPR-2016 version of this manuscrip
Additional Positive Enables Better Representation Learning for Medical Images
This paper presents a new way to identify additional positive pairs for BYOL,
a state-of-the-art (SOTA) self-supervised learning framework, to improve its
representation learning ability. Unlike conventional BYOL which relies on only
one positive pair generated by two augmented views of the same image, we argue
that information from different images with the same label can bring more
diversity and variations to the target features, thus benefiting representation
learning. To identify such pairs without any label, we investigate TracIn, an
instance-based and computationally efficient influence function, for BYOL
training. Specifically, TracIn is a gradient-based method that reveals the
impact of a training sample on a test sample in supervised learning. We extend
it to the self-supervised learning setting and propose an efficient batch-wise
per-sample gradient computation method to estimate the pairwise TracIn to
represent the similarity of samples in the mini-batch during training. For each
image, we select the most similar sample from other images as the additional
positive and pull their features together with BYOL loss. Experimental results
on two public medical datasets (i.e., ISIC 2019 and ChestX-ray) demonstrate
that the proposed method can improve the classification performance compared to
other competitive baselines in both semi-supervised and transfer learning
settings.
Comment: 8 page
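
The selection step described in this abstract can be illustrated with a minimal sketch. It assumes that per-sample gradient vectors for the mini-batch are already available and uses their dot product as a TracIn-style similarity. The helper names (`pick_additional_positives`, `byol_loss`) and the toy gradients are hypothetical, not taken from the paper.

```python
import math


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def byol_loss(prediction, target):
    """BYOL-style regression loss: 2 - 2 * cosine similarity."""
    norm = math.sqrt(dot(prediction, prediction)) * math.sqrt(dot(target, target))
    return 2.0 - 2.0 * dot(prediction, target) / norm


def pick_additional_positives(grads):
    """For each sample, return the index of the *other* sample in the batch
    whose gradient has the largest dot product with its own gradient."""
    picks = []
    for i, g_i in enumerate(grads):
        scores = [(dot(g_i, g_j), j) for j, g_j in enumerate(grads) if j != i]
        picks.append(max(scores)[1])
    return picks


# Toy per-sample gradients (made up): samples 0/1 and 2/3 point the same way.
toy_grads = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.8]]
picks = pick_additional_positives(toy_grads)
print(picks)
```

In a real training loop, the feature of each selected sample would serve as an extra target in `byol_loss` alongside the usual augmented view. How the paper weights that extra term is not stated in the abstract.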
Label-efficient Contrastive Learning-based model for nuclei detection and classification in 3D Cardiovascular Immunofluorescent Images
Recently, deep learning-based methods have achieved promising performance in
nuclei detection and classification applications. However, training deep
learning-based methods requires a large amount of pixel-wise annotated data,
which is time-consuming and labor-intensive, especially in 3D images. An
alternative approach is to adapt weak-annotation methods, such as labeling each
nucleus with a point, but this method does not extend from 2D histopathology
images (for which it was originally developed) to 3D immunofluorescent images.
The reason is that 3D images contain multiple channels (z-axis) for nuclei and
different markers separately, which makes training using point annotations
difficult. To address this challenge, we propose the Label-efficient
Contrastive learning-based (LECL) model to detect and classify various types of
nuclei in 3D immunofluorescent images. Previous methods use Maximum Intensity
Projection (MIP) to convert immunofluorescent images with multiple slices to 2D
images, which can cause signals from different z-stacks to falsely appear
associated with each other. To overcome this, we devised an Extended Maximum
Intensity Projection (EMIP) approach that addresses the issues with MIP.
Furthermore, we applied a Supervised Contrastive Learning (SCL) approach for
weakly supervised settings. We conducted experiments on cardiovascular datasets
and found that our proposed framework is effective and efficient in detecting
and classifying various types of nuclei in 3D immunofluorescent images.
Comment: 11 pages, 5 figures, MICCAI Workshop Conference 202
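
As a rough illustration of the plain MIP step mentioned in this abstract (not the proposed EMIP, whose details the abstract does not give), the sketch below takes the per-pixel maximum over the z-axis. The function name and the toy stack are assumptions made for the example.

```python
def max_intensity_projection(stack):
    """Collapse a z-stack (a list of 2D slices) into one 2D image by
    taking the per-pixel maximum across all slices."""
    return [
        [max(values) for values in zip(*rows)]  # same pixel across all slices
        for rows in zip(*stack)                 # same row across all slices
    ]


# Toy 3-slice stack: one bright pixel at z=0 and another at z=2.
stack = [
    [[9, 0], [0, 0]],  # z = 0
    [[0, 0], [0, 0]],  # z = 1
    [[0, 8], [0, 0]],  # z = 2
]
mip = max_intensity_projection(stack)
print(mip)  # the two signals now sit side by side in a single 2D image
```

The projected image no longer records which slice each bright pixel came from, which is the false-association problem the abstract attributes to MIP.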
Deep Learning Models to Characterize Smooth Muscle Fibers in Hematoxylin and Eosin Stained Histopathological Images of the Urinary Bladder
Muscularis propria (MP) and muscularis mucosa (MM), two types of smooth muscle fibers in the urinary bladder, are major benchmarks in staging bladder cancer, used to distinguish between muscle-invasive (MP invasion) and non-muscle-invasive (MM invasion) disease. While patients with non-muscle-invasive tumors can be treated conservatively with transurethral resection (TUR) only, those with muscle-invasive tumors often require more aggressive treatment, such as removal of the entire bladder, known as radical cystectomy (RC), which may severely degrade the patient’s quality of life. Hence, given two types of image datasets, hematoxylin & eosin-stained histopathological images from RC and TUR specimens, we propose the first deep learning-based method for efficient characterization of MP. The proposed method is intended to aid pathologists as a decision support system by facilitating accurate staging of bladder cancer. In this work, we aim to semantically segment the TUR images into MP and non-MP regions using two different approaches, patch-to-label and pixel-to-label. We evaluate four state-of-the-art CNN-based models (VGG16, ResNet18, SqueezeNet, and MobileNetV2) and four semantic segmentation-based models (U-Net, MA-Net, DeepLabv3+, and FPN) and compare their performance metrics at the pixel level. The SqueezeNet model (mean Jaccard Index: 95.44%, mean Dice coefficient: 97.66%) in the patch-to-label approach and the MA-Net model (mean Jaccard Index: 96.64%, mean Dice coefficient: 98.29%) in the pixel-to-label approach are the best among the tested models. Although the pixel-to-label approach is marginally better than the patch-to-label approach on the evaluation metrics, the latter is more computationally efficient, as it uses the fewest trainable parameters.
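
The two pixel-level metrics quoted in this abstract can be computed as in the sketch below. It assumes binary masks flattened to lists of 0/1, and the function name and toy masks are illustrative only. For a single pair of masks the two scores are linked by Dice = 2J / (1 + J). Averages taken over many images need not satisfy that relation exactly.

```python
def jaccard_and_dice(pred, truth):
    """Return (Jaccard index, Dice coefficient) for two binary masks
    given as flat sequences of 0/1 values of equal length."""
    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    pred_area = sum(1 for p in pred if p)
    truth_area = sum(1 for t in truth if t)
    union = pred_area + truth_area - intersection
    if union == 0:
        # Both masks are empty: treat as perfect agreement.
        return 1.0, 1.0
    jaccard = intersection / union
    dice = 2 * intersection / (pred_area + truth_area)
    return jaccard, dice


# Toy masks: 3 predicted MP pixels, 3 true MP pixels, 2 shared.
pred = [1, 1, 1, 0, 0, 0]
truth = [1, 1, 0, 1, 0, 0]
j, d = jaccard_and_dice(pred, truth)
print(j, d)
```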
Quantifying Facial Age by Posterior of Age Comparisons
We introduce a novel approach for annotating a large quantity of in-the-wild
facial images with high-quality posterior age distributions as labels. Each
posterior provides a probability distribution of estimated ages for a face. Our
approach is motivated by observations that it is easier to distinguish who is
the older of two people than to determine the person's actual age. Given a
reference database with samples of known ages and a dataset to label, we can
transfer reliable annotations from the former to the latter via
human-in-the-loop comparisons. We show an effective way to transform such
comparisons into a posterior via fully-connected and SoftMax layers, so as to
permit end-to-end training in a deep network. Thanks to the efficient and
effective annotation approach, we collect a new large-scale facial age dataset,
dubbed `MegaAge', which consists of 41,941 images. Data can be downloaded from
our project page mmlab.ie.cuhk.edu.hk/projects/MegaAge and
github.com/zyx2012/Age_estimation_BMVC2017. With the dataset, we train a
network that jointly performs ordinal hyperplane classification and posterior
distribution learning. Our approach achieves state-of-the-art results on
popular benchmarks such as MORPH2, Adience, and the newly proposed MegaAge.
Comment: To appear on BMVC 2017 (oral) revised versio
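
To give a feel for how pairwise "older or younger" answers can be turned into an age posterior, the sketch below uses a simple Bayesian update with a logistic comparison likelihood. This is a stand-in technique chosen for illustration, not the fully-connected and SoftMax formulation described in the abstract. The function name, the `sharpness` parameter, the uniform prior over ages 0 to 100, and the toy comparisons are all assumptions.

```python
import math


def age_posterior(comparisons, ages=range(0, 101), sharpness=0.5):
    """Posterior over candidate ages given comparison outcomes.

    comparisons: list of (reference_age, query_is_older) pairs.
    Assumes a uniform prior and a logistic likelihood that the query
    is judged older than a reference of a given age.
    """
    log_post = []
    for age in ages:
        total = 0.0
        for ref_age, query_is_older in comparisons:
            p_older = 1.0 / (1.0 + math.exp(-sharpness * (age - ref_age)))
            p = p_older if query_is_older else 1.0 - p_older
            total += math.log(max(p, 1e-12))  # clamp to avoid log(0)
        log_post.append(total)
    peak = max(log_post)
    weights = [math.exp(v - peak) for v in log_post]  # stable normalisation
    norm = sum(weights)
    return [w / norm for w in weights]


# Toy answers: older than the 20- and 25-year-old references,
# younger than the 35- and 40-year-old references.
comparisons = [(20, True), (25, True), (35, False), (40, False)]
posterior = age_posterior(comparisons)
most_likely_age = posterior.index(max(posterior))
print(most_likely_age)
```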