Attribute-Graph: A Graph based approach to Image Ranking
We propose a novel image representation, termed Attribute-Graph, to rank
images by their semantic similarity to a given query image. An Attribute-Graph
is an undirected fully connected graph, incorporating both local and global
image characteristics. The graph nodes characterise objects as well as the
overall scene context using mid-level semantic attributes, while the edges
capture the object topology. We demonstrate the effectiveness of
Attribute-Graphs by applying them to the problem of image ranking. We benchmark
the performance of our algorithm on the 'rPascal' and 'rImageNet' datasets,
which we have created in order to evaluate the ranking performance on complex
queries containing multiple objects. Our experimental evaluation shows that
modelling images as Attribute-Graphs results in improved ranking performance
over existing techniques.
Comment: In IEEE International Conference on Computer Vision (ICCV) 201
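
The graph structure described above can be sketched as a small data structure. This is only an illustration under stated assumptions: the abstract does not say how attributes are computed or how edges encode topology, so the `Node` fields, the `build_attribute_graph` helper, and the use of centre-to-centre distance as the edge value are all hypothetical choices, not the authors' method.

```python
from dataclasses import dataclass
from itertools import combinations
import math
from typing import Optional, Sequence, Tuple


@dataclass
class Node:
    label: str                               # object name, or "scene" for the global node
    attributes: Sequence[float]              # mid-level attribute scores (assumed given)
    centre: Optional[Tuple[float, float]]    # object centre; None for the scene node


def build_attribute_graph(objects, scene_attributes):
    """Build an undirected, fully connected graph over objects plus one scene node.

    Edge values are an assumption: Euclidean distance between object centres,
    and None for edges that touch the scene node.
    """
    nodes = [Node("scene", scene_attributes, None)] + list(objects)
    edges = {}
    # Every unordered pair of nodes gets exactly one edge (fully connected).
    for i, j in combinations(range(len(nodes)), 2):
        a, b = nodes[i].centre, nodes[j].centre
        edges[(i, j)] = None if a is None or b is None else math.dist(a, b)
    return nodes, edges


objects = [Node("dog", [0.9, 0.1], (0.0, 0.0)), Node("ball", [0.2, 0.8], (3.0, 4.0))]
nodes, edges = build_attribute_graph(objects, scene_attributes=[0.5, 0.5])
```

With `n` nodes the graph holds `n * (n - 1) / 2` edges, which is what "fully connected" implies for an undirected graph.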
Data-free parameter pruning for Deep Neural Networks
Deep neural nets (NNs) with millions of parameters are at the heart of many
state-of-the-art computer vision systems today. However, recent works have
shown that much smaller models can achieve similar levels of performance. In
this work, we address the problem of pruning parameters in a trained NN model.
Instead of removing individual weights one at a time as done in previous works,
we remove one neuron at a time. We show how similar neurons are redundant, and
propose a systematic way to remove them. Our experiments in pruning the densely
connected layers show that we can remove up to 85% of the total parameters in
an MNIST-trained network, and about 35% for AlexNet, without significantly
affecting performance. Our method can be applied on top of most networks with a
fully connected layer to give a smaller network.
Comment: BMVC 201
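
The idea that similar neurons are redundant can be illustrated with a minimal sketch. The assumptions are loud ones: the abstract does not state the similarity criterion, so the code below simply picks the pair of hidden neurons whose incoming weights (including bias) are closest in Euclidean distance, removes one, and adds its outgoing weights to the other. The function name `prune_one_neuron` and the array layout are hypothetical. When two neurons have identical incoming weights they produce identical activations, so this merge leaves the layer output unchanged; for merely similar neurons it is an approximation.

```python
import numpy as np


def prune_one_neuron(W, b, A):
    """Remove the hidden neuron most similar to another and merge it into that one.

    W: (n_hidden, n_in) incoming weights, b: (n_hidden,) biases,
    A: (n_out, n_hidden) outgoing weights of the next layer.
    Similarity measure (assumed): squared Euclidean distance of [W | b] rows.
    """
    Wb = np.hstack([W, b[:, None]])
    diff = Wb[:, None, :] - Wb[None, :, :]
    dist = np.sum(diff ** 2, axis=2)
    np.fill_diagonal(dist, np.inf)          # ignore self-distances
    i, j = np.unravel_index(np.argmin(dist), dist.shape)
    A = A.copy()
    A[:, i] += A[:, j]                      # neuron i takes over neuron j's outgoing weights
    keep = np.arange(W.shape[0]) != j       # drop neuron j
    return W[keep], b[keep], A[:, keep]


rng = np.random.default_rng(0)
W = rng.normal(size=(4, 3))
b = rng.normal(size=4)
W[3], b[3] = W[1], b[1]                     # make neurons 1 and 3 exact duplicates
A = rng.normal(size=(2, 4))
x = rng.normal(size=3)

y_before = A @ np.maximum(W @ x + b, 0)     # ReLU hidden layer
W2, b2, A2 = prune_one_neuron(W, b, A)
y_after = A2 @ np.maximum(W2 @ x + b2, 0)
```

Note that the sketch needs only the trained weights and no training data, which matches the "data-free" setting in the title.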
Image Denoising via CNNs: An Adversarial Approach
Is it possible to recover an image from its noisy version using convolutional
neural networks? This is an interesting problem as convolutional layers are
generally used as feature detectors for tasks like classification, segmentation
and object detection. We present a new CNN architecture for blind image
denoising which synergistically combines three architectural components: a
multi-scale feature extraction layer, which helps reduce the effect of
noise on feature maps; an l_p regularizer, which helps select only the
appropriate feature maps for the task of reconstruction; and a three-step
training approach, which leverages adversarial training to give the final
performance boost to the model. The proposed model shows competitive denoising
performance when compared to the state-of-the-art approaches.
Analyzing structural characteristics of object category representations from their semantic-part distributions
Studies from neuroscience show that part-mapping computations are employed by
the human visual system in the process of object recognition. In this work, we
present an approach for analyzing semantic-part characteristics of object
category representations. For our experiments, we use category-epitome, a
recently proposed sketch-based spatial representation for objects. To enable
part-importance analysis, we first obtain semantic-part annotations of
hand-drawn sketches originally used to construct the corresponding epitomes. We
then examine the extent to which the semantic-parts are present in the epitomes
of a category and visualize the relative importance of parts as a word cloud.
Finally, we show how such word cloud visualizations provide an intuitive
understanding of category-level structural trends that exist in the
category-epitome object representations.
Object Level Deep Feature Pooling for Compact Image Representation
Convolutional Neural Network (CNN) features have been successfully employed
in recent works as an image descriptor for various vision tasks. But the
inability of the deep CNN features to exhibit invariance to geometric
transformations and object compositions poses a great challenge for image
search. In this work, we demonstrate the effectiveness of the objectness prior
over the deep CNN features of image regions for obtaining an invariant image
representation. The proposed approach represents the image as a vector of
pooled CNN features describing the underlying objects. This representation
provides robustness to spatial layout of the objects in the scene and achieves
invariance to general geometric transformations, such as translation, rotation
and scaling. The proposed approach also leads to a compact representation of
the scene, making each image occupy a smaller memory footprint. Experiments
show that the proposed representation achieves state-of-the-art retrieval
results on a set of challenging benchmark image datasets, while maintaining a
compact representation.
Comment: Deep Vision 201
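
Pooling per-object features into one image vector can be sketched as follows. This is a rough illustration, not the authors' pipeline: it assumes the CNN features of the object regions have already been extracted into an array, and it assumes element-wise max pooling followed by L2 normalisation, neither of which is specified in the abstract. The name `pool_region_features` is hypothetical.

```python
import numpy as np


def pool_region_features(region_feats):
    """Collapse (n_regions, d) region features into a single d-dimensional descriptor.

    Pooling operator (assumed): element-wise max over regions, then L2 normalisation.
    """
    feats = np.asarray(region_feats, dtype=float)
    pooled = feats.max(axis=0)
    norm = np.linalg.norm(pooled)
    return pooled / norm if norm > 0 else pooled


rng = np.random.default_rng(1)
regions = rng.random((5, 8))                 # 5 object regions, 8-dim features each
descriptor = pool_region_features(regions)
shuffled = pool_region_features(regions[::-1])
```

Because the pooling runs over the set of regions, the descriptor does not depend on the order or position of the objects, and its size stays fixed at `d` however many regions the image contains.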