3,335 research outputs found
Tile2Vec: Unsupervised representation learning for spatially distributed data
Geospatial analysis lacks methods like the word vector representations and
pre-trained networks that significantly boost performance across a wide range
of natural language and computer vision tasks. To fill this gap, we introduce
Tile2Vec, an unsupervised representation learning algorithm that extends the
distributional hypothesis from natural language -- words appearing in similar
contexts tend to have similar meanings -- to spatially distributed data. We
demonstrate empirically that Tile2Vec learns semantically meaningful
representations on three datasets. Our learned representations significantly
improve performance in downstream classification tasks and, similar to word
vectors, visual analogies can be obtained via simple arithmetic in the latent
space.Comment: 8 pages, 4 figures in main text; 9 pages, 11 figures in appendi
GraphMatch: Efficient Large-Scale Graph Construction for Structure from Motion
We present GraphMatch, an approximate yet efficient method for building the
matching graph for large-scale structure-from-motion (SfM) pipelines. Unlike
modern SfM pipelines that use vocabulary (Voc.) trees to quickly build the
matching graph and avoid a costly brute-force search of matching image pairs,
GraphMatch does not require an expensive offline pre-processing phase to
construct a Voc. tree. Instead, GraphMatch leverages two priors that can
predict which image pairs are likely to match, thereby making the matching
process for SfM much more efficient. The first is a score computed from the
distance between the Fisher vectors of any two images. The second prior is
based on the graph distance between vertices in the underlying matching graph.
GraphMatch combines these two priors into an iterative "sample-and-propagate"
scheme similar to the PatchMatch algorithm. Its sampling stage uses Fisher
similarity priors to guide the search for matching image pairs, while its
propagation stage explores neighbors of matched pairs to find new ones with a
high image similarity score. Our experiments show that GraphMatch finds the
most image pairs as compared to competing, approximate methods while at the
same time being the most efficient.Comment: Published at IEEE 3DV 201
A new BART prior for flexible modeling with categorical predictors
Default implementations of Bayesian Additive Regression Trees (BART)
represent categorical predictors using several binary indicators, one for each
level of each categorical predictor. Regression trees built with these
indicators partition the levels using a ``remove one a time strategy.''
Unfortunately, the vast majority of partitions of the levels cannot be built
with this strategy, severely limiting BART's ability to ``borrow strength''
across groups of levels. We overcome this limitation with a new class of
regression tree and a new decision rule prior that can assign multiple levels
to both the left and right child of a decision node. Motivated by spatial
applications with areal data, we introduce a further decision rule prior that
partitions the areas into spatially contiguous regions by deleting edges from
random spanning trees of a suitably defined network. We implemented our new
regression tree priors in the flexBART package, which, compared to existing
implementations, often yields improved out-of-sample predictive performance
without much additional computational burden. We demonstrate the efficacy of
flexBART using examples from baseball and the spatiotemporal modeling of crime.Comment: Software available at https://github.com/skdeshpande91/flexBAR
Metrics for Graph Comparison: A Practitioner's Guide
Comparison of graph structure is a ubiquitous task in data analysis and
machine learning, with diverse applications in fields such as neuroscience,
cyber security, social network analysis, and bioinformatics, among others.
Discovery and comparison of structures such as modular communities, rich clubs,
hubs, and trees in data in these fields yields insight into the generative
mechanisms and functional properties of the graph.
Often, two graphs are compared via a pairwise distance measure, with a small
distance indicating structural similarity and vice versa. Common choices
include spectral distances (also known as distances) and distances
based on node affinities. However, there has of yet been no comparative study
of the efficacy of these distance measures in discerning between common graph
topologies and different structural scales.
In this work, we compare commonly used graph metrics and distance measures,
and demonstrate their ability to discern between common topological features
found in both random graph models and empirical datasets. We put forward a
multi-scale picture of graph structure, in which the effect of global and local
structure upon the distance measures is considered. We make recommendations on
the applicability of different distance measures to empirical graph data
problem based on this multi-scale view. Finally, we introduce the Python
library NetComp which implements the graph distances used in this work
- …