24,819 research outputs found
HD-CNN: Hierarchical Deep Convolutional Neural Network for Large Scale Visual Recognition
In image classification, visual separability between different object
categories is highly uneven, and some categories are more difficult to
distinguish than others. Such difficult categories demand more dedicated
classifiers. However, existing deep convolutional neural networks (CNN) are
trained as flat N-way classifiers, and few efforts have been made to leverage
the hierarchical structure of categories. In this paper, we introduce
hierarchical deep CNNs (HD-CNNs) by embedding deep CNNs into a category
hierarchy. An HD-CNN separates easy classes using a coarse category classifier
while distinguishing difficult classes using fine category classifiers. During
HD-CNN training, component-wise pretraining is followed by global finetuning
with a multinomial logistic loss regularized by a coarse category consistency
term. In addition, conditional executions of fine category classifiers and
layer parameter compression make HD-CNNs scalable for large-scale visual
recognition. We achieve state-of-the-art results on both CIFAR100 and
large-scale ImageNet 1000-class benchmark datasets. In our experiments, we
build up three different HD-CNNs and they lower the top-1 error of the standard
CNNs by 2.65%, 3.1% and 1.1%, respectively.Comment: Add new results on ImageNet using VGG-16-layer building block ne
Hierarchy-based Image Embeddings for Semantic Image Retrieval
Deep neural networks trained for classification have been found to learn
powerful image representations, which are also often used for other tasks such
as comparing images w.r.t. their visual similarity. However, visual similarity
does not imply semantic similarity. In order to learn semantically
discriminative features, we propose to map images onto class embeddings whose
pair-wise dot products correspond to a measure of semantic similarity between
classes. Such an embedding does not only improve image retrieval results, but
could also facilitate integrating semantics for other tasks, e.g., novelty
detection or few-shot learning. We introduce a deterministic algorithm for
computing the class centroids directly based on prior world-knowledge encoded
in a hierarchy of classes such as WordNet. Experiments on CIFAR-100, NABirds,
and ImageNet show that our learned semantic image embeddings improve the
semantic consistency of image retrieval results by a large margin.Comment: Accepted at WACV 2019. Source code:
https://github.com/cvjena/semantic-embedding
A visual embedding for the unsupervised extraction of abstract semantics
Vector-space word representations obtained from neural network models have been shown to enable semantic operations based on vector arithmetic. In this paper, we explore the existence of similar information on vector representations of images. For that purpose we define a methodology to obtain large, sparse vector representations of image classes, and generate vectors through the state-of-the-art deep learning architecture GoogLeNet for 20 K images obtained from ImageNet. We first evaluate the resultant vector-space semantics through its correlation with WordNet distances, and find vector distances to be strongly correlated with linguistic semantics. We then explore the location of images within the vector space, finding elements close in WordNet to be clustered together, regardless of significant visual variances (e.g., 118 dog types). More surprisingly, we find that the space unsupervisedly separates complex classes without prior knowledge (e.g., living things). Afterwards, we consider vector arithmetics. Although we are unable to obtain meaningful results on this regard, we discuss the various problem we encountered, and how we consider to solve them. Finally, we discuss the impact of our research for cognitive systems, focusing on the role of the architecture being used.This work is partially supported by the Joint Study Agreement no. W156463 under the IBM/BSC Deep Learning Center agreement, by the Spanish Government through Programa Severo Ochoa (SEV-2015-0493), by the Spanish Ministry of Science and Technology through TIN2015-65316-P project and by the Generalitat de Catalunya (contracts 2014-SGR-1051), and by the Core Research for Evolutional Science and Technology (CREST) program of Japan Science and Technology Agency (JST).Peer ReviewedPostprint (published version
- …