Zero-Shot Learning by Convex Combination of Semantic Embeddings

Authors: Samy Bengio, Greg S. Corrado, Jeffrey Dean, Andrea Frome, Tomas Mikolov, Mohammad Norouzi, Jonathon Shlens, Yoram Singer
Publication date: 21/03/2014

Several recent publications have proposed methods for mapping images into continuous semantic embedding spaces. In some cases the embedding space is trained jointly with the image transformation. In other cases the semantic embedding space is established by an independent natural language processing task, and then the image transformation into that space is learned in a second stage. Proponents of these image embedding systems have stressed their advantages over the traditional n-way classification framing of image understanding, particularly in terms of the promise for zero-shot learning -- the ability to correctly annotate images of previously unseen object categories. In this paper, we propose a simple method for constructing an image embedding system from any existing n-way image classifier and a semantic word embedding model, which contains the n class labels in its vocabulary. Our method maps images into the semantic embedding space via convex combination of the class label embedding vectors, and requires no additional training. We show that this simple and direct method confers many of the advantages associated with more complex image embedding schemes, and indeed outperforms state-of-the-art methods on the ImageNet zero-shot learning task.
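The convex-combination mapping described in the abstract can be illustrated with a minimal sketch. It assumes the classifier outputs one probability per training label and that each label has a word-embedding vector. The function names, the optional `top_t` truncation, the cosine-similarity lookup, and the toy vectors are all illustrative assumptions, not details taken from the abstract.

```python
import math

def conse_embed(probs, label_vecs, top_t=None):
    """Map an image into embedding space as a convex combination of
    label embeddings, weighted by renormalized classifier probabilities.
    top_t (an assumption here) keeps only the most probable labels."""
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    if top_t is not None:
        ranked = ranked[:top_t]
    total = sum(probs[i] for i in ranked)
    if total <= 0:
        raise ValueError("selected probabilities must sum to a positive value")
    dim = len(label_vecs[0])
    out = [0.0] * dim
    for i in ranked:
        weight = probs[i] / total  # weights are non-negative and sum to 1
        for d in range(dim):
            out[d] += weight * label_vecs[i][d]
    return out

def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def zero_shot_label(image_vec, unseen_vecs):
    """Pick the unseen label whose embedding is closest to image_vec."""
    return max(unseen_vecs, key=lambda name: cosine(image_vec, unseen_vecs[name]))

# Toy data (hypothetical): three seen classes with 2-D label embeddings.
probs = [0.6, 0.3, 0.1]
label_vecs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
unseen = {"liger": [0.7, 0.3], "truck": [-1.0, 0.2]}

image_vec = conse_embed(probs, label_vecs, top_t=2)
print(zero_shot_label(image_vec, unseen))  # prints "liger"
```

No parameters are trained in this sketch: the only inputs are an existing classifier's probabilities and a fixed set of label embeddings, which matches the abstract's claim that the method requires no additional training.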

arXiv.org e-Print Archive

CiteSeerX

Tile2Vec: Unsupervised representation learning for spatially distributed data

Authors: George Azzari, Stefano Ermon, Neal Jean, David Lobell, Anshul Samar, Sherrie Wang
Publication date: 30/05/2018

Geospatial analysis lacks methods like the word vector representations and pre-trained networks that significantly boost performance across a wide range of natural language and computer vision tasks. To fill this gap, we introduce Tile2Vec, an unsupervised representation learning algorithm that extends the distributional hypothesis from natural language -- words appearing in similar contexts tend to have similar meanings -- to spatially distributed data. We demonstrate empirically that Tile2Vec learns semantically meaningful representations on three datasets. Our learned representations significantly improve performance in downstream classification tasks and, similar to word vectors, visual analogies can be obtained via simple arithmetic in the latent space.

Comment: 8 pages, 4 figures in main text; 9 pages, 11 figures in appendix
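The latent-space arithmetic mentioned in the abstract can be illustrated with a small sketch of an analogy query of the form "a is to b as c is to ?". The tile names, the 2-D vectors, and the Euclidean nearest-neighbour lookup are hypothetical stand-ins. Real embeddings would come from a trained model, and the abstract does not say how neighbours are retrieved.

```python
import math

def analogy_query(a, b, c):
    """Return the vector a - b + c, computed elementwise."""
    return [x - y + z for x, y, z in zip(a, b, c)]

def nearest(query, candidates):
    """Name of the candidate embedding closest to query (Euclidean distance)."""
    return min(candidates, key=lambda name: math.dist(query, candidates[name]))

# Toy 2-D tile embeddings (hypothetical): axis 0 ~ "urban", axis 1 ~ "wet".
urban_dry = [1.0, 0.0]
rural_dry = [0.0, 0.0]
rural_wet = [0.0, 1.0]

# Candidate tiles to search, excluding the three query tiles themselves.
candidates = {"urban_wet": [1.0, 0.9], "forest": [-1.0, -1.0]}

query = analogy_query(urban_dry, rural_dry, rural_wet)
print(nearest(query, candidates))  # prints "urban_wet"
```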

arXiv.org e-Print Archive

Association for the Advancement of Artificial Intelligence: AAAI Publications