4,414 research outputs found
Tile2Vec: Unsupervised representation learning for spatially distributed data
Geospatial analysis lacks methods like the word vector representations and
pre-trained networks that significantly boost performance across a wide range
of natural language and computer vision tasks. To fill this gap, we introduce
Tile2Vec, an unsupervised representation learning algorithm that extends the
distributional hypothesis from natural language -- words appearing in similar
contexts tend to have similar meanings -- to spatially distributed data. We
demonstrate empirically that Tile2Vec learns semantically meaningful
representations on three datasets. Our learned representations significantly
improve performance in downstream classification tasks and, similar to word
vectors, visual analogies can be obtained via simple arithmetic in the latent
space.Comment: 8 pages, 4 figures in main text; 9 pages, 11 figures in appendi
End-to-end Learning for Short Text Expansion
Effectively making sense of short texts is a critical task for many real
world applications such as search engines, social media services, and
recommender systems. The task is particularly challenging as a short text
contains very sparse information, often too sparse for a machine learning
algorithm to pick up useful signals. A common practice for analyzing short text
is to first expand it with external information, which is usually harvested
from a large collection of longer texts. In literature, short text expansion
has been done with all kinds of heuristics. We propose an end-to-end solution
that automatically learns how to expand short text to optimize a given learning
task. A novel deep memory network is proposed to automatically find relevant
information from a collection of longer documents and reformulate the short
text through a gating mechanism. Using short text classification as a
demonstrating task, we show that the deep memory network significantly
outperforms classical text expansion methods with comprehensive experiments on
real world data sets.Comment: KDD'201
- …