Attributes2Classname: A discriminative model for attribute-based unsupervised zero-shot learning
We propose a novel approach for unsupervised zero-shot learning (ZSL) of
classes based on their names. Most existing unsupervised ZSL methods aim to
learn a model for directly comparing image features and class names. However,
this proves to be a difficult task due to the dominance of non-visual semantics
in the underlying vector-space embeddings of class names. To address this issue,
we discriminatively learn a word representation such that the similarities
between class names and combinations of attribute names fall in line with the visual
similarity. Contrary to the traditional zero-shot learning approaches that are
built upon attribute presence, our approach bypasses the laborious
attribute-class relation annotations for unseen classes. In addition, our
proposed approach makes text-only training possible, so the training data can
be augmented without the need to collect additional image data. The
experimental results show that our method yields state-of-the-art results for
unsupervised ZSL on three benchmark datasets.
Comment: To appear at IEEE Int. Conference on Computer Vision (ICCV) 201
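The scoring idea in this abstract (comparing a class name against a combination of attribute names in a shared word-embedding space) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the 3-d toy vectors, the attribute and class names, the weighted-average combination, and the helper `predict_class` are hypothetical, and the paper's discriminative training of the embedding is not shown.

```python
import numpy as np

def cosine(a, b):
    """Cosine similarity between two 1-d vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def predict_class(attr_scores, attr_vecs, class_vecs):
    """Pick the class name whose embedding is closest to the combined attribute embedding.

    attr_scores: per-attribute confidences for one image (assumed to come from an attribute predictor)
    attr_vecs:   one embedding row per attribute name
    class_vecs:  dict mapping class name -> embedding
    """
    # Assumed combination rule: confidence-weighted average of attribute-name embeddings.
    combined = attr_scores @ attr_vecs / attr_scores.sum()
    sims = {name: cosine(combined, vec) for name, vec in class_vecs.items()}
    return max(sims, key=sims.get), sims

# Toy embeddings (hypothetical values, not learned).
attr_vecs = np.array([
    [1.0, 0.0, 0.0],  # "striped"
    [0.0, 1.0, 0.0],  # "hooved"
    [0.0, 0.0, 1.0],  # "aquatic"
])
class_vecs = {
    "zebra": np.array([0.7, 0.7, 0.0]),
    "dolphin": np.array([0.0, 0.1, 0.9]),
}

attr_scores = np.array([0.9, 0.8, 0.05])  # image looks striped and hooved
label, sims = predict_class(attr_scores, attr_vecs, class_vecs)
print(label, sims)
```

Because unseen classes enter only through their name embeddings, no per-class attribute annotation is needed at test time in this sketch, which mirrors the motivation stated in the abstract.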
Learning to Reconstruct Shapes from Unseen Classes
From a single image, humans are able to perceive the full 3D shape of an
object by exploiting learned shape priors from everyday life. Contemporary
single-image 3D reconstruction algorithms aim to solve this task in a similar
fashion, but often end up with priors that are highly biased by training
classes. Here we present an algorithm, Generalizable Reconstruction (GenRe),
designed to capture more generic, class-agnostic shape priors. We achieve this
with an inference network and training procedure that combine 2.5D
representations of visible surfaces (depth and silhouette), spherical shape
representations of both visible and non-visible surfaces, and 3D voxel-based
representations, in a principled manner that exploits the causal structure of
how 3D shapes give rise to 2D images. Experiments demonstrate that GenRe
performs well on single-view shape reconstruction, and generalizes to diverse
novel objects from categories not seen during training.
Comment: NeurIPS 2018 (Oral). The first two authors contributed equally to
this paper. Project page: http://genre.csail.mit.edu
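To make the term "2.5D representation" concrete, the sketch below shows how a depth map and a silhouette mask determine the visible surface in a voxel grid. This is not GenRe's network or its spherical-map stage; the orthographic camera, the depth range of [0, 1), and the function `depth_to_voxels` are assumptions chosen for illustration.

```python
import numpy as np

def depth_to_voxels(depth, mask, res):
    """Mark voxels hit by visible surface points.

    Assumes an orthographic camera looking along the third grid axis and
    depth values normalized to [0, 1).
    """
    h, w = depth.shape
    grid = np.zeros((res, res, res), dtype=bool)
    ys, xs = np.nonzero(mask)          # pixels inside the silhouette
    i = (ys * res) // h                # image row    -> voxel row
    j = (xs * res) // w                # image column -> voxel column
    k = np.clip((depth[ys, xs] * res).astype(int), 0, res - 1)  # depth -> voxel slice
    grid[i, j, k] = True
    return grid

# Toy input: a flat 2x2 patch at depth 0.5 in a 4x4 image.
depth = np.full((4, 4), 0.5)
mask = np.zeros((4, 4), dtype=bool)
mask[1:3, 1:3] = True

grid = depth_to_voxels(depth, mask, res=4)
print(grid.sum())
```

Only the visible surface is filled in; the occluded part of the shape is what a reconstruction method has to infer from shape priors.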
Zero Shot Learning with the Isoperimetric Loss
We introduce the isoperimetric loss as a regularization criterion for
learning the map from a visual representation to a semantic embedding, to be
used to transfer knowledge to unknown classes in a zero-shot learning setting.
We use a pre-trained deep neural network model as a visual representation of
image data, a Word2Vec embedding of class labels, and linear maps between the
visual and semantic embedding spaces. However, the spaces themselves are not
linear, and we postulate the sample embedding to be populated by noisy samples
near otherwise smooth manifolds. We exploit the graph structure defined by the
sample points to regularize the estimates of the manifolds by inferring the
graph connectivity using a generalization of the isoperimetric inequalities
from Riemannian geometry to graphs. Surprisingly, this regularization alone,
paired with the simplest baseline model, outperforms the state-of-the-art among
fully automated methods in zero-shot learning benchmarks such as AwA and CUB.
This improvement is achieved solely by learning the structure of the underlying
spaces by imposing regularity.
Comment: Accepted to AAAI-2
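The baseline pipeline this abstract builds on (a linear map from visual features to a semantic embedding, followed by nearest-class lookup among unseen classes) can be sketched as follows. The isoperimetric regularizer itself is not implemented here; plain ridge regression stands in for it. The toy class embeddings, the mixing matrix `A`, the noise level, and all function names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical 3-d semantic embeddings for seen and unseen class labels.
seen = {"cat": [1.0, 0.0, 0.0], "car": [0.0, 1.0, 0.0], "tree": [0.0, 0.0, 1.0]}
unseen = {"truck": [0.1, 0.9, 0.2], "bush": [0.1, 0.1, 0.9]}

# Assumed toy relation between semantic and 5-d "visual" features.
A = np.array([[1, 0, 0, 1, 0],
              [0, 1, 0, 0, 1],
              [0, 0, 1, 1, 1]], dtype=float)

def render(sem, n):
    """Generate n noisy visual feature vectors for one semantic embedding."""
    return np.asarray(sem) @ A + 0.01 * rng.standard_normal((n, A.shape[1]))

# Training data from seen classes only.
X = np.vstack([render(v, 20) for v in seen.values()])
S = np.vstack([np.tile(v, (20, 1)) for v in seen.values()])

# Ridge regression: W minimizes ||X W - S||^2 + lam * ||W||^2.
lam = 1e-2
W = np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ S)

def classify(x, classes):
    """Map a visual feature into semantic space and return the closest class name."""
    z = x @ W
    def cos(a, b):
        return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))
    return max(classes, key=lambda name: cos(z, np.asarray(classes[name])))

x_test = render(unseen["truck"], 1)[0]
pred = classify(x_test, unseen)
print(pred)
```

In this sketch the map `W` is fit without ever seeing "truck" or "bush"; knowledge transfers only through the shared semantic space, which is the setting the abstract's regularizer is meant to improve.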