2,988 research outputs found
Zero-Shot Learning by Convex Combination of Semantic Embeddings
Several recent publications have proposed methods for mapping images into
continuous semantic embedding spaces. In some cases the embedding space is
trained jointly with the image transformation. In other cases the semantic
embedding space is established by an independent natural language processing
task, and then the image transformation into that space is learned in a second
stage. Proponents of these image embedding systems have stressed their
advantages over the traditional \nway{} classification framing of image
understanding, particularly in terms of the promise for zero-shot learning --
the ability to correctly annotate images of previously unseen object
categories. In this paper, we propose a simple method for constructing an image
embedding system from any existing \nway{} image classifier and a semantic word
embedding model, which contains the \n class labels in its vocabulary. Our
method maps images into the semantic embedding space via convex combination of
the class label embedding vectors, and requires no additional training. We show
that this simple and direct method confers many of the advantages associated
with more complex image embedding schemes, and indeed outperforms state of the
art methods on the ImageNet zero-shot learning task
Zero-Shot Object Detection by Hybrid Region Embedding
Object detection is considered as one of the most challenging problems in
computer vision, since it requires correct prediction of both classes and
locations of objects in images. In this study, we define a more difficult
scenario, namely zero-shot object detection (ZSD) where no visual training data
is available for some of the target object classes. We present a novel approach
to tackle this ZSD problem, where a convex combination of embeddings are used
in conjunction with a detection framework. For evaluation of ZSD methods, we
propose a simple dataset constructed from Fashion-MNIST images and also a
custom zero-shot split for the Pascal VOC detection challenge. The experimental
results suggest that our method yields promising results for ZSD
Evaluation of Output Embeddings for Fine-Grained Image Classification
Image classification has advanced significantly in recent years with the
availability of large-scale image sets. However, fine-grained classification
remains a major challenge due to the annotation cost of large numbers of
fine-grained categories. This project shows that compelling classification
performance can be achieved on such categories even without labeled training
data. Given image and class embeddings, we learn a compatibility function such
that matching embeddings are assigned a higher score than mismatching ones;
zero-shot classification of an image proceeds by finding the label yielding the
highest joint compatibility score. We use state-of-the-art image features and
focus on different supervised attributes and unsupervised output embeddings
either derived from hierarchies or learned from unlabeled text corpora. We
establish a substantially improved state-of-the-art on the Animals with
Attributes and Caltech-UCSD Birds datasets. Most encouragingly, we demonstrate
that purely unsupervised output embeddings (learned from Wikipedia and improved
with fine-grained text) achieve compelling results, even outperforming the
previous supervised state-of-the-art. By combining different output embeddings,
we further improve results.Comment: @inproceedings {ARWLS15, title = {Evaluation of Output Embeddings for
Fine-Grained Image Classification}, booktitle = {IEEE Computer Vision and
Pattern Recognition}, year = {2015}, author = {Zeynep Akata and Scott Reed
and Daniel Walter and Honglak Lee and Bernt Schiele}
Recent Advances in Transfer Learning for Cross-Dataset Visual Recognition: A Problem-Oriented Perspective
This paper takes a problem-oriented perspective and presents a comprehensive
review of transfer learning methods, both shallow and deep, for cross-dataset
visual recognition. Specifically, it categorises the cross-dataset recognition
into seventeen problems based on a set of carefully chosen data and label
attributes. Such a problem-oriented taxonomy has allowed us to examine how
different transfer learning approaches tackle each problem and how well each
problem has been researched to date. The comprehensive problem-oriented review
of the advances in transfer learning with respect to the problem has not only
revealed the challenges in transfer learning for visual recognition, but also
the problems (e.g. eight of the seventeen problems) that have been scarcely
studied. This survey not only presents an up-to-date technical review for
researchers, but also a systematic approach and a reference for a machine
learning practitioner to categorise a real problem and to look up for a
possible solution accordingly
- …