Deep Neural Networks In Fully Connected CRF For Image Labeling With Social Network Metadata
We propose a novel method for predicting image labels by fusing image content
descriptors with the social media context of each image. An image uploaded to a
social media site such as Flickr often has meaningful, associated information,
such as comments and other images the user has uploaded, that is complementary
to pixel content and helpful in predicting labels. Prediction challenges such
as ImageNet~\cite{imagenet_cvpr09} and MSCOCO~\cite{LinMBHPRDZ:ECCV14} use only
pixels, while other methods make predictions purely from social media context
\cite{McAuleyECCV12}. Our method is based on a novel fully connected
Conditional Random Field (CRF) framework in which each node is an image and is
modeled by two deep Convolutional Neural Networks (CNNs) and one Recurrent
Neural Network (RNN) that capture both textual and visual node/image
information. The edge weights of the CRF graph represent textual similarity and
link-based metadata such as user sets and image groups. We model the CRF as an
RNN for both learning and inference, and incorporate a weighted ranking loss
and a cross-entropy loss into the CRF parameter optimization to handle the
training-data imbalance. Our proposed approach is evaluated on the MIR-9K
dataset and experimentally outperforms current state-of-the-art approaches.
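Modeling the CRF as an RNN amounts to iterated mean-field updates over the image graph. Below is a minimal NumPy sketch of such updates; this is not the authors' implementation, and the function names, toy graph, and Potts compatibility are all illustrative assumptions:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def meanfield_crf(unary, edge_w, compat, n_iters=5):
    """Mean-field inference on a fully connected CRF over images.

    unary:  (N, L) per-node (image) label scores, e.g. from a CNN/RNN
    edge_w: (N, N) edge weights, e.g. textual/metadata similarity
    compat: (L, L) label compatibility (penalty) matrix
    """
    q = softmax(unary)
    for _ in range(n_iters):
        msg = edge_w @ q          # aggregate neighbors' beliefs, edge-weighted
        pairwise = msg @ compat   # apply label compatibility transform
        q = softmax(unary - pairwise)
    return q

# toy example: 3 images, 2 labels; images 0 and 1 are strongly linked,
# so node 1's weak unary evidence is pulled toward node 0's label
unary = np.array([[2.0, 0.0], [0.1, 0.0], [0.0, 2.0]])
edge_w = np.array([[0.0, 1.0, 0.0],
                   [1.0, 0.0, 0.0],
                   [0.0, 0.0, 0.0]])
compat = 1.0 - np.eye(2)  # Potts model: penalize differing labels
q = meanfield_crf(unary, edge_w, compat)
```

In the paper these updates are unrolled as an RNN so that the edge weights and compatibilities can be learned by backpropagation; the sketch above only shows the fixed-parameter inference loop.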
A new approach to descriptors generation for image retrieval by analyzing activations of deep neural network layers
In this paper, we consider the problem of descriptors construction for the
task of content-based image retrieval using deep neural networks. The idea of
neural codes, based on fully connected layer activations, is extended by
incorporating the information contained in convolutional layers. It is known
that the total number of neurons in the convolutional part of the network is
large and the majority of them have little influence on the final
classification decision. Therefore, in this paper we propose a novel algorithm
that allows us to extract the most significant neuron activations and utilize
this information to construct effective descriptors. The descriptors consisting
of values taken from both the fully connected and convolutional layers
effectively represent the whole image content. The images retrieved using
these descriptors match the query image semantically and are also similar in
secondary image characteristics such as background, texture, and color
distribution. These properties of the proposed descriptors are verified
experimentally on the IMAGENET1M dataset using the VGG16 neural network.
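The idea of keeping only the most significant convolutional activations alongside the fully connected ones could be sketched as follows. This is a rough stand-in that uses a simple maximum-response criterion per channel, not the paper's exact significance measure; all names and dimensions are hypothetical:

```python
import numpy as np

def build_descriptor(fc_act, conv_act, k=8):
    """Concatenate fully connected activations with the k strongest
    convolutional channel responses (illustrative selection rule).

    fc_act:   (D,) activations from a fully connected layer
    conv_act: (C, H, W) activations from a convolutional layer
    """
    # reduce each channel to its strongest spatial response
    channel_strength = conv_act.reshape(conv_act.shape[0], -1).max(axis=1)
    top_idx = np.argsort(channel_strength)[::-1][:k]  # most active channels
    desc = np.concatenate([fc_act, channel_strength[top_idx]])
    return desc / (np.linalg.norm(desc) + 1e-12)  # L2-normalize for retrieval

rng = np.random.default_rng(0)
fc_act = rng.random(16)            # stand-in for e.g. a VGG16 fc layer
conv_act = rng.random((32, 7, 7))  # stand-in for a late conv block
desc = build_descriptor(fc_act, conv_act, k=8)
```

Retrieval would then rank database images by cosine similarity of such descriptors, which the L2 normalization reduces to a dot product.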
Development of Conditional Random Field Insert for UNet-based Zonal Prostate Segmentation on T2-Weighted MRI
Purpose: A conventional 2D UNet convolutional neural network (CNN)
architecture may result in ill-defined boundaries in segmentation output.
Several studies, such as SegNet, imposed stronger constraints on each level of
UNet to improve the performance of 2D UNet. In this study, we investigated 2D
SegNet and a proposed conditional random field insert (CRFI) for zonal prostate
segmentation from clinical T2-weighted MRI data.
Methods: We introduced a new methodology that combines SegNet and CRFI to
improve the accuracy and robustness of the segmentation. CRFI has feedback
connections that encourage data consistency at multiple levels of the
feature pyramid. On the encoder side of the SegNet, the CRFI combines the input
feature maps and convolution block output based on their spatial local
similarity, like a trainable bilateral filter. For all networks, 725 2D images
(i.e., 29 MRI cases) were used for training, while 174 2D images (i.e., 6
cases) were used for testing.
Results: The SegNet with CRFI achieved relatively high Dice coefficients
(0.76, 0.84, and 0.89) for the peripheral zone, central zone, and whole gland,
respectively. Compared with UNet, the SegNet+CRFI segmentation had generally
higher Dice scores and was more robust in determining the boundaries of
anatomical structures than either the SegNet or UNet segmentation. A SegNet
with a CRFI at the end showed that the CRFI can correct segmentation errors in
the SegNet output, generating smooth and consistent segmentation of the
prostate.
Conclusion: The UNet-based deep neural networks demonstrated in this study can
perform zonal prostate segmentation, achieving high Dice coefficients compared
with those in the literature. The proposed CRFI method can reduce the fuzzy
boundaries that affected the segmentation performance of the baseline UNet and
SegNet models.
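The CRFI's similarity-gated blending of a block's input and output feature maps can be illustrated with a toy, non-trainable version, where a fixed Gaussian gate stands in for the learned bilateral-style weighting; all names and the gating form are illustrative assumptions:

```python
import numpy as np

def crfi_combine(x_in, x_conv, sigma=1.0):
    """Toy CRFI-style fusion of an encoder block's input and output.

    Blends the two (H, W) feature maps with a gate based on their local
    similarity: where they agree, trust the conv output; where they
    disagree, fall back toward the input (a fixed-parameter stand-in
    for the trainable bilateral-filter-like behavior).
    """
    diff = (x_in - x_conv) ** 2
    gate = np.exp(-diff / (2.0 * sigma ** 2))  # close to 1 where maps agree
    return gate * x_conv + (1.0 - gate) * x_in

x_in = np.array([[0.0, 1.0], [1.0, 0.0]])    # toy block input
x_conv = np.array([[0.0, 0.9], [0.2, 0.0]])  # toy conv block output
fused = crfi_combine(x_in, x_conv)
```

In the actual CRFI the gate parameters are learned and the fusion feeds back across multiple levels of the feature pyramid; the sketch only shows the single-level, fixed-gate case.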
VITAL: A Visual Interpretation on Text with Adversarial Learning for Image Labeling
In this paper, we propose a novel way to interpret text information by
extracting visual feature representations from multiple high-resolution,
photo-realistic synthetic images generated by a Text-to-image Generative
Adversarial Network (GAN) to improve the performance of image labeling.
First, we design a stacked Generative Multi-Adversarial Network (GMAN),
StackGMAN++, a modified version of the current state-of-the-art Text-to-image
GAN, StackGAN++, to generate multiple synthetic images with various prior
noises conditioned on a text. We then extract deep visual features from the
generated synthetic images to explore the underlying visual concepts of the
text. Finally, we combine the image-level visual features, text-level
features, and visual features based on the synthetic images to predict labels
for images. We conduct experiments on two benchmark datasets, and the
experimental results clearly demonstrate the efficacy of our proposed
approach.
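The final fusion step, combining real-image, text, and synthetic-image features for label prediction, can be sketched as a simple concatenate-and-classify pipeline. This is an illustrative simplification under assumed mean-pooling and linear scoring; the paper's actual fusion and classifier may differ, and all names are hypothetical:

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def predict_labels(img_feat, txt_feat, synth_feats, w, b):
    """Fuse real-image, text, and synthetic-image features, then score
    labels with a linear layer (illustrative fusion scheme).

    synth_feats: (S, D) features from S GAN-generated images,
                 mean-pooled into one vector before fusion.
    """
    synth_feat = np.mean(synth_feats, axis=0)  # pool over GAN samples
    fused = np.concatenate([img_feat, txt_feat, synth_feat])
    return softmax(w @ fused + b)              # per-label probabilities

rng = np.random.default_rng(1)
img_feat = rng.random(8)          # features from the real image
txt_feat = rng.random(4)          # features from the text
synth_feats = rng.random((3, 8))  # features from 3 synthetic images
w = rng.random((5, 20))           # 5 candidate labels, 8+4+8 fused dims
b = np.zeros(5)
probs = predict_labels(img_feat, txt_feat, synth_feats, w, b)
```

Mean-pooling the synthetic-image features is one simple way to make the fusion invariant to the number of GAN samples; attention-weighted pooling would be a natural alternative.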