Deep Neural Networks In Fully Connected CRF For Image Labeling With Social Network Metadata
We propose a novel method for predicting image labels by fusing image content
descriptors with the social media context of each image. An image uploaded to a
social media site such as Flickr often has meaningful, associated information,
such as comments and other images the user has uploaded, that is complementary
to pixel content and helpful in predicting labels. Prediction challenges such
as ImageNet~\cite{imagenet_cvpr09} and MSCOCO~\cite{LinMBHPRDZ:ECCV14} use only
pixels, while other methods make predictions purely from social media context
\cite{McAuleyECCV12}. Our method is based on a novel fully connected
Conditional Random Field (CRF) framework in which each node is an image and is
modeled by two deep Convolutional Neural Networks (CNNs) and one Recurrent
Neural Network (RNN) that capture both textual and visual node/image
information. The edge weights of the CRF graph represent textual similarity and
link-based metadata such as user sets and image groups. We model the CRF as an
RNN for both learning and inference, and incorporate a weighted ranking loss
and a cross-entropy loss into the CRF parameter optimization to handle the
training-data imbalance. Our proposed approach is evaluated on the MIR-9K
dataset and experimentally outperforms current state-of-the-art approaches.
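Modeling the CRF as an RNN amounts to iterated mean-field updates over the image graph. Below is a minimal NumPy sketch of such updates; this is not the authors' implementation, and the function names, toy graph, and Potts compatibility are all illustrative assumptions:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def meanfield_crf(unary, edge_w, compat, n_iters=5):
    """Mean-field inference on a fully connected CRF over images.

    unary:  (N, L) per-node (image) label scores, e.g. from a CNN/RNN
    edge_w: (N, N) edge weights, e.g. textual/metadata similarity
    compat: (L, L) label compatibility (penalty) matrix
    """
    q = softmax(unary)
    for _ in range(n_iters):
        msg = edge_w @ q          # aggregate neighbors' beliefs, edge-weighted
        pairwise = msg @ compat   # apply label compatibility transform
        q = softmax(unary - pairwise)
    return q

# toy example: 3 images, 2 labels; images 0 and 1 are strongly linked,
# so node 1's weak unary evidence is pulled toward node 0's label
unary = np.array([[2.0, 0.0], [0.1, 0.0], [0.0, 2.0]])
edge_w = np.array([[0.0, 1.0, 0.0],
                   [1.0, 0.0, 0.0],
                   [0.0, 0.0, 0.0]])
compat = 1.0 - np.eye(2)  # Potts model: penalize differing labels
q = meanfield_crf(unary, edge_w, compat)
```

In the paper these updates are unrolled as an RNN so that the edge weights and compatibilities can be learned by backpropagation; the sketch above only shows the fixed-parameter inference loop.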
A new approach to descriptors generation for image retrieval by analyzing activations of deep neural network layers
In this paper, we consider the problem of descriptors construction for the
task of content-based image retrieval using deep neural networks. The idea of
neural codes, based on fully connected layer activations, is extended by
incorporating the information contained in convolutional layers. It is known
that the total number of neurons in the convolutional part of the network is
large and the majority of them have little influence on the final
classification decision. Therefore, in this paper we propose a novel algorithm
that allows us to extract the most significant neuron activations and utilize
this information to construct effective descriptors. The descriptors consisting
of values taken from both the fully connected and convolutional layers
effectively represent the whole image content. The images retrieved using
these descriptors match the query image semantically and are also similar in
secondary image characteristics such as background, texture, and color
distribution. These properties of the proposed descriptors are verified
experimentally on the IMAGENET1M dataset using the VGG16 neural network.
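The idea of keeping only the most significant convolutional activations alongside the fully connected ones could be sketched as follows. This is a rough stand-in that uses a simple maximum-response criterion per channel, not the paper's exact significance measure; all names and dimensions are hypothetical:

```python
import numpy as np

def build_descriptor(fc_act, conv_act, k=8):
    """Concatenate fully connected activations with the k strongest
    convolutional channel responses (illustrative selection rule).

    fc_act:   (D,) activations from a fully connected layer
    conv_act: (C, H, W) activations from a convolutional layer
    """
    # reduce each channel to its strongest spatial response
    channel_strength = conv_act.reshape(conv_act.shape[0], -1).max(axis=1)
    top_idx = np.argsort(channel_strength)[::-1][:k]  # most active channels
    desc = np.concatenate([fc_act, channel_strength[top_idx]])
    return desc / (np.linalg.norm(desc) + 1e-12)  # L2-normalize for retrieval

rng = np.random.default_rng(0)
fc_act = rng.random(16)            # stand-in for e.g. a VGG16 fc layer
conv_act = rng.random((32, 7, 7))  # stand-in for a late conv block
desc = build_descriptor(fc_act, conv_act, k=8)
```

Retrieval would then rank database images by cosine similarity of such descriptors, which the L2 normalization reduces to a dot product.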
Development of Conditional Random Field Insert for UNet-based Zonal Prostate Segmentation on T2-Weighted MRI
Purpose: A conventional 2D UNet convolutional neural network (CNN)
architecture may result in ill-defined boundaries in segmentation output.
Several studies, such as SegNet, imposed stronger constraints on each level of
UNet to improve the performance of 2D UNet. In this study, we investigated 2D
SegNet and a proposed conditional random field insert (CRFI) for zonal prostate
segmentation from clinical T2-weighted MRI data.
Methods: We introduced a new methodology that combines SegNet and CRFI to
improve the accuracy and robustness of the segmentation. CRFI has feedback
connections that encourage data consistency at multiple levels of the
feature pyramid. On the encoder side of the SegNet, the CRFI combines the input
feature maps and convolution block output based on their spatial local
similarity, like a trainable bilateral filter. For all networks, 725 2D images
(i.e., 29 MRI cases) were used for training, while 174 2D images (i.e., 6
cases) were used for testing.
Results: The SegNet with CRFI achieved relatively high Dice coefficients
(0.76, 0.84, and 0.89) for the peripheral zone, central zone, and whole gland,
respectively. Compared with UNet, the SegNet+CRFI segmentation had generally
higher Dice scores and was more robust in determining the boundaries of
anatomical structures than either the SegNet or UNet segmentation. A SegNet
with a CRFI at the end showed that the CRFI can correct segmentation errors in
the SegNet output, generating smooth and consistent segmentation of the
prostate.
Conclusion: The UNet-based deep neural networks demonstrated in this study can
perform zonal prostate segmentation, achieving high Dice coefficients compared
with those in the literature. The proposed CRFI method can reduce the fuzzy
boundaries that affected the segmentation performance of the baseline UNet and
SegNet models.
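The CRFI's similarity-gated blending of a block's input and output feature maps can be illustrated with a toy, non-trainable version, where a fixed Gaussian gate stands in for the learned bilateral-style weighting; all names and the gating form are illustrative assumptions:

```python
import numpy as np

def crfi_combine(x_in, x_conv, sigma=1.0):
    """Toy CRFI-style fusion of an encoder block's input and output.

    Blends the two (H, W) feature maps with a gate based on their local
    similarity: where they agree, trust the conv output; where they
    disagree, fall back toward the input (a fixed-parameter stand-in
    for the trainable bilateral-filter-like behavior).
    """
    diff = (x_in - x_conv) ** 2
    gate = np.exp(-diff / (2.0 * sigma ** 2))  # close to 1 where maps agree
    return gate * x_conv + (1.0 - gate) * x_in

x_in = np.array([[0.0, 1.0], [1.0, 0.0]])    # toy block input
x_conv = np.array([[0.0, 0.9], [0.2, 0.0]])  # toy conv block output
fused = crfi_combine(x_in, x_conv)
```

In the actual CRFI the gate parameters are learned and the fusion feeds back across multiple levels of the feature pyramid; the sketch only shows the single-level, fixed-gate case.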
VITAL: A Visual Interpretation on Text with Adversarial Learning for Image Labeling
In this paper, we propose a novel way to interpret text information by
extracting visual feature representations from multiple high-resolution,
photo-realistic synthetic images generated by a Text-to-image Generative
Adversarial Network (GAN) to improve the performance of image labeling.
First, we design a stacked Generative Multi-Adversarial Network (GMAN),
StackGMAN++, a modified version of the current state-of-the-art Text-to-image
GAN, StackGAN++, to generate multiple synthetic images with various prior
noises conditioned on a text. We then extract deep visual features from the
generated synthetic images to explore the underlying visual concepts of the
text. Finally, we combine the image-level visual features, text-level
features, and visual features based on the synthetic images to predict labels
for images. We conduct experiments on two benchmark datasets, and the
experimental results clearly demonstrate the efficacy of our proposed
approach.
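The final fusion step, combining real-image, text, and synthetic-image features for label prediction, can be sketched as a simple concatenate-and-classify pipeline. This is an illustrative simplification under assumed mean-pooling and linear scoring; the paper's actual fusion and classifier may differ, and all names are hypothetical:

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def predict_labels(img_feat, txt_feat, synth_feats, w, b):
    """Fuse real-image, text, and synthetic-image features, then score
    labels with a linear layer (illustrative fusion scheme).

    synth_feats: (S, D) features from S GAN-generated images,
                 mean-pooled into one vector before fusion.
    """
    synth_feat = np.mean(synth_feats, axis=0)  # pool over GAN samples
    fused = np.concatenate([img_feat, txt_feat, synth_feat])
    return softmax(w @ fused + b)              # per-label probabilities

rng = np.random.default_rng(1)
img_feat = rng.random(8)          # features from the real image
txt_feat = rng.random(4)          # features from the text
synth_feats = rng.random((3, 8))  # features from 3 synthetic images
w = rng.random((5, 20))           # 5 candidate labels, 8+4+8 fused dims
b = np.zeros(5)
probs = predict_labels(img_feat, txt_feat, synth_feats, w, b)
```

Mean-pooling the synthetic-image features is one simple way to make the fusion invariant to the number of GAN samples; attention-weighted pooling would be a natural alternative.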