Simpler, faster and shorter labels for distances in graphs
We consider how to assign labels to any undirected graph with n nodes such
that, given the labels of two nodes and no other information regarding the
graph, it is possible to determine the distance between the two nodes. The
challenge in such a distance labeling scheme is primarily to minimize the
maximum label length and secondarily to minimize the time needed to answer
distance queries (decoding). Previous schemes have offered different trade-offs
between label lengths and query time. This paper presents a simple algorithm
with shorter labels and shorter query time than any previous solution, thereby
improving the state-of-the-art with respect to both label length and query time
in one single algorithm. Our solution addresses several open problems
concerning label length and decoding time and is the first improvement of label
length for more than three decades.
More specifically, we present a distance labeling scheme with label size
((log 3)/2)n + o(n) (logarithms are in base 2) and O(1) decoding time. This outperforms
all existing results with respect to both size and decoding time, including
Winkler's (Combinatorica 1983) decades-old result, which uses labels of size
(log 3)n and O(n/log n) decoding time, and Gavoille et al. (SODA'01), which
uses labels of size 11n + o(n) and O(loglog n) decoding time. In addition, our
algorithm is simpler than the previous ones. In the case of integral edge
weights of size at most W, we present almost matching upper and lower bounds
for label sizes. For r-additive approximation schemes, where distances can be
off by an additive constant r, we give both upper and lower bounds. In
particular, we present an upper bound for 1-additive approximation schemes
which, in the unweighted case, has the same size (ignoring second order terms)
as an adjacency scheme: n/2. We also give results for bipartite graphs and for
exact and 1-additive distance oracles
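To make the definition of a distance labeling scheme concrete, here is a minimal sketch of the trivial baseline, not the scheme of the paper above. It assumes an unweighted, undirected graph given as adjacency lists, and it stores in each label the node's full row of BFS distances, so labels take on the order of n log n bits. The function names `bfs_distances`, `encode_labels` and `decode` are illustrative only.

```python
from collections import deque


def bfs_distances(adj, source):
    """Return hop distances from source; unreachable nodes get None."""
    dist = [None] * len(adj)
    dist[source] = 0
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if dist[v] is None:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def encode_labels(adj):
    """Naive scheme: label(v) = (v, distances from v to every node)."""
    return [(v, tuple(bfs_distances(adj, v))) for v in range(len(adj))]


def decode(label_u, label_v):
    """Answer a distance query from the two labels alone, without the graph."""
    _, row_u = label_u
    v, _ = label_v
    return row_u[v]


# Usage: the path 0 - 1 - 2 - 3.
path = [[1], [0, 2], [1, 3], [2]]
labels = encode_labels(path)
print(decode(labels[0], labels[3]))
```

The decoder never touches the graph, which is the defining property; the schemes in the abstract aim at the same property with much shorter labels.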
Sublinear Distance Labeling
A distance labeling scheme labels the nodes of a graph with binary
strings such that, given the labels of any two nodes, one can determine the
distance in the graph between the two nodes by looking only at the labels. A
-preserving distance labeling scheme only returns precise distances between
pairs of nodes that are at distance at least from each other. In this paper
we consider distance labeling schemes for the classical case of unweighted
graphs with both directed and undirected edges.
We present a bit -preserving distance labeling
scheme, improving the previous bound by Bollobás et al. [SIAM J. Discrete
Math. 2005]. We also give an almost matching lower bound of
. With our -preserving distance labeling scheme as a
building block, we additionally achieve the following results:
1. We present the first distance labeling scheme of size for sparse
graphs (and hence bounded degree graphs). This addresses an open problem by
Gavoille et al. [J. Algo. 2004], thereby separating the complexity from
distance labeling in general graphs which require bits, Moon [Proc.
of Glasgow Math. Association 1965].
2. For approximate -additive labeling schemes, that return distances
within an additive error of we show a scheme of size for .
This improves on the current best bound of by
Alstrup et al. [SODA 2016] for sub-polynomial , and is a generalization of
a result by Gawrychowski et al. [arXiv preprint 2015] who showed this for .
Comment: A preliminary version of this paper appeared at ESA'1
Faster Shortest Paths in Dense Distance Graphs, with Applications
We show how to combine two techniques for efficiently computing shortest
paths in directed planar graphs. The first is the linear-time shortest-path
algorithm of Henzinger, Klein, Subramanian, and Rao [STOC'94]. The second is
Fakcharoenphol and Rao's algorithm [FOCS'01] for emulating Dijkstra's algorithm
on the dense distance graph (DDG). A DDG is defined for a decomposition of a
planar graph into regions of at most vertices each, for some parameter
. The vertex set of the DDG is the set of vertices
of that belong to more than one region (boundary vertices). The DDG has
arcs, such that distances in the DDG are equal to the distances in
. Fakcharoenphol and Rao's implementation of Dijkstra's algorithm on the DDG
(nicknamed FR-Dijkstra) runs in time, and is a
key component in many state-of-the-art planar graph algorithms for shortest
paths, minimum cuts, and maximum flows. By combining these two techniques we
remove the dependency in the running time of the shortest-path
algorithm, making it .
This work is part of a research agenda that aims to develop new techniques
that would lead to faster, possibly linear-time, algorithms for problems such
as minimum-cut, maximum-flow, and shortest paths with negative arc lengths. As
immediate applications, we show how to compute maximum flow in directed
weighted planar graphs in time, where is the minimum number
of edges on any path from the source to the sink. We also show how to compute
any part of the DDG that corresponds to a region with vertices and
boundary vertices in time, which is faster than has been
previously known for small values of
Distance labeling schemes for trees
We consider distance labeling schemes for trees: given a tree with nodes,
label the nodes with binary strings such that, given the labels of any two
nodes, one can determine, by looking only at the labels, the distance in the
tree between the two nodes.
A lower bound by Gavoille et al. (J. Alg. 2004) and an upper bound by Peleg
(J. Graph Theory 2000) establish that labels must use
bits (throughout this paper we use for ). Gavoille et
al. (ESA 2001) show that for very small approximate stretch, labels use
bits. Several other papers investigate various
variants such as, for example, small distances in trees (Alstrup et al.,
SODA'03).
We improve the known upper and lower bounds of exact distance labeling by
showing that bits are needed and that bits are sufficient. We also give ()-stretch labeling
schemes using bits for constant .
()-stretch labeling schemes with polylogarithmic label size have
previously been established for doubling dimension graphs by Talwar (STOC
2004).
In addition, we present matching upper and lower bounds for distance labeling
for caterpillars, showing that labels must have size . For simple paths with nodes and edge weights in , we show that
labels must have size
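As an illustration of what a distance labeling scheme for trees does, here is a minimal sketch of a naive scheme, not any of the schemes in the abstract above. It assumes a rooted tree given as a hypothetical `parent` array, and it uses the root-to-node path as the label, so label length grows with depth rather than meeting the bounds discussed. The distance is then recovered from the length of the common prefix of the two labels.

```python
def encode_tree_labels(parent):
    """parent[v] is the parent of v, or None for the root.

    Returns a dict mapping each node to the tuple of node ids on the
    path from the root down to that node.
    """
    labels = {}
    for v in range(len(parent)):
        # Climb until we reach the root or an already labeled ancestor.
        chain = []
        u = v
        while u is not None and u not in labels:
            chain.append(u)
            u = parent[u]
        prefix = labels[u] if u is not None else ()
        # Walk back down, extending the ancestor's label one node at a time.
        for w in reversed(chain):
            prefix = prefix + (w,)
            labels[w] = prefix
    return labels


def tree_distance(label_u, label_v):
    """Distance from the two labels alone: edges below the deepest common ancestor."""
    common = 0
    for a, b in zip(label_u, label_v):
        if a != b:
            break
        common += 1
    return (len(label_u) - common) + (len(label_v) - common)


# Usage: root 0 with children 1 and 2; node 1 has children 3 and 4.
labels = encode_tree_labels([None, 0, 0, 1, 1])
print(tree_distance(labels[3], labels[2]))
```

Storing whole paths is wasteful for deep trees; the point of the sketch is only that decoding needs nothing beyond the two labels.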
Hardness of Exact Distance Queries in Sparse Graphs Through Hub Labeling
A distance labeling scheme is an assignment of bit-labels to the vertices of
an undirected, unweighted graph such that the distance between any pair of
vertices can be decoded solely from their labels. An important class of
distance labeling schemes is that of hub labelings, where a node
stores its distance to the so-called hubs , chosen so that for
any there is belonging to some shortest
path. Notice that for most existing graph classes, the best known distance
labeling constructions use a hub labeling scheme at some point, at least as a
key building block. Our interest lies in hub labelings of sparse graphs, i.e.,
those with , for which we show a lower bound of
for the average size of the hubsets.
Additionally, we show a hub-labeling construction for sparse graphs of average
size for some , where is the
so-called Ruzsa-Szemerédi function, linked to the structure of induced
matchings in dense graphs. This implies that further improving the lower bound
on hub labeling size to would require a
breakthrough in the study of lower bounds on , which have resisted
substantial improvement in the last 70 years. For general distance labeling of
sparse graphs, we show a lower bound of , where is the communication complexity of the
Sum-Index problem over . Our results suggest that the best achievable
hub-label size and distance-label size in sparse graphs may be
for some
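The hub labeling idea described above can be sketched as follows. This is a minimal illustration under stated assumptions, not a construction from the paper: each label is assumed to be a Python dict mapping a hub to the node's distance to that hub, and the hub sets in the usage example were chosen by hand for a five-node path so that every pair shares a hub on a shortest path. The name `hub_distance` is hypothetical.

```python
def hub_distance(hubs_u, hubs_v):
    """Decode dist(u, v) as the minimum of dist(u, h) + dist(h, v) over common hubs h.

    Returns None if the two hub sets share no hub.
    """
    # Iterate over the smaller hub set and probe the larger one.
    if len(hubs_u) > len(hubs_v):
        hubs_u, hubs_v = hubs_v, hubs_u
    best = None
    for h, du in hubs_u.items():
        dv = hubs_v.get(h)
        if dv is not None and (best is None or du + dv < best):
            best = du + dv
    return best


# Usage: the path 0 - 1 - 2 - 3 - 4 with hand-picked hub sets.
labels = {
    0: {0: 0, 1: 1, 2: 2},
    1: {1: 0, 2: 1},
    2: {2: 0},
    3: {3: 0, 2: 1},
    4: {4: 0, 3: 1, 2: 2},
}
print(hub_distance(labels[0], labels[4]))
```

The decoded value is exact only when the hub sets satisfy the shortest-path covering property stated in the abstract; with arbitrary hub sets it is merely an upper bound on the distance.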
A simple yet effective baseline for non-attributed graph classification
Graphs are complex objects that do not lend themselves easily to typical
learning tasks. Recently, a range of approaches based on graph kernels or graph
neural networks have been developed for graph classification and for
representation learning on graphs in general. As the developed methodologies
become more sophisticated, it is important to understand which components of
the increasingly complex methods are necessary or most effective.
As a first step, we develop a simple yet meaningful graph representation, and
explore its effectiveness in graph classification. We test our baseline
representation for the graph classification task on a range of graph datasets.
Interestingly, this simple representation achieves performance similar to that
of state-of-the-art graph kernels and graph neural networks for non-attributed
graph classification. Its performance on classifying attributed graphs is
slightly weaker as it does not incorporate attributes. However, given its
simplicity and efficiency, we believe that it still serves as an effective
baseline for attributed graph classification. Our graph representation is
efficient (linear-time) to compute. We also provide a simple connection with
the graph neural networks.
Note that these observations are only for the task of graph classification
while existing methods are often designed for a broader scope including node
embedding and link prediction. The results are also likely biased due to the
limited number of benchmark datasets available. Nevertheless, the good
performance of our simple baseline calls for the development of new, more
comprehensive benchmark datasets so as to better evaluate and analyze different
graph learning methods. Furthermore, given the computational efficiency of our
graph summary, we believe that it is a good candidate as a baseline method for
future graph classification (or even other graph learning) studies.
Comment: 13 pages. Shorter version appears at 2019 ICLR Workshop:
Representation Learning on Graphs and Manifolds. arXiv admin note: text
overlap with arXiv:1810.00826 by other author
Partition MCMC for inference on acyclic digraphs
Acyclic digraphs are the underlying representation of Bayesian networks, a
widely used class of probabilistic graphical models. Learning the underlying
graph from data is a way of gaining insights about the structural properties of
a domain. Structure learning forms one of the inference challenges of
statistical graphical models.
MCMC methods, notably structure MCMC, which sample graphs from the posterior
distribution given the data, are probably the only viable option for Bayesian
model averaging. Score modularity and restrictions on the number of parents of
each node allow the graphs to be grouped into larger collections, which can be
scored as a whole to improve the chain's convergence. Current examples of
algorithms taking advantage of grouping are the biased order MCMC, which acts
on the alternative space of permuted triangular matrices, and non-ergodic edge
reversal moves.
Here we propose a novel algorithm, which employs the underlying combinatorial
structure of DAGs to define a new grouping. As a result, convergence is improved
compared to structure MCMC, while still retaining the property of producing an
unbiased sample. Finally, the method can be combined with edge reversal moves to
improve the sampler further.
Comment: Revised version. 34 pages, 16 figures. R code available at
https://github.com/annlia/partitionMCM
Transit Node Routing Reconsidered
Transit Node Routing (TNR) is a fast and exact distance oracle for road
networks. We show several new results for TNR. First, we give a surprisingly
simple implementation fully based on Contraction Hierarchies that speeds up
preprocessing by an order of magnitude, approaching the time for just finding a
CH (which alone has two orders of magnitude larger query time). We also develop
a very effective purely graph-theoretical locality filter without any
compromise in query times. Finally, we show that a specialization to the online
many-to-one (or one-to-many) shortest path further speeds up query time by an
order of magnitude. This variant even has better query time than the fastest
known previous methods which need much more space.
Comment: 19 pages, submitted to SEA'201