43 research outputs found
Pruning based Distance Sketches with Provable Guarantees on Random Graphs
Measuring the distances between vertices on graphs is one of the most
fundamental components in network analysis. Since finding shortest paths
requires traversing the graph, it is challenging to obtain distance information
on large graphs very quickly. In this work, we present a preprocessing
algorithm that is able to create landmark based distance sketches efficiently,
with strong theoretical guarantees. When evaluated on a diverse set of social
and information networks, our algorithm significantly improves over existing
approaches by reducing the number of landmarks stored, preprocessing time, or
stretch of the estimated distances.
On Erdős-Rényi graphs and random power law graphs with degree
distribution exponent , our algorithm outputs an exact distance
data structure with space between and
depending on the value of , where is the number of vertices. We
complement the algorithm with tight lower bounds for Erdős-Rényi graphs and the
case when is close to two. Comment: Full version for the conference paper to appear in The Web
Conference'1
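As a rough illustration of what a landmark based distance sketch is, the following Python sketch stores, for every vertex, its hop distance to a small set of landmarks and answers queries through the best common landmark. It is a generic baseline under stated assumptions (undirected, unweighted graph; landmarks chosen by hand), not the pruning algorithm of the paper; the names `bfs_distances`, `build_sketches`, and `estimate_distance` are hypothetical.

```python
from collections import deque

def bfs_distances(adj, source):
    """Return hop distances from source to every reachable vertex."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def build_sketches(adj, landmarks):
    """sketch[v] maps each reachable landmark to its distance from v."""
    sketch = {v: {} for v in adj}
    for landmark in landmarks:
        for v, d in bfs_distances(adj, landmark).items():
            sketch[v][landmark] = d
    return sketch

def estimate_distance(sketch, u, v):
    """Upper bound on d(u, v) via the best common landmark, or None."""
    common = sketch[u].keys() & sketch[v].keys()
    if not common:
        return None
    return min(sketch[u][l] + sketch[v][l] for l in common)

# Path graph 0 - 1 - 2 - 3 - 4 with vertex 2 as the only landmark (assumed).
adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
sketch = build_sketches(adj, landmarks=[2])

exact = estimate_distance(sketch, 0, 4)      # landmark lies on the shortest path
stretched = estimate_distance(sketch, 0, 1)  # landmark is off the shortest path
```

The estimate is exact whenever some landmark lies on a shortest path between the two vertices and is otherwise only an upper bound, which is why the abstract reports stretch alongside the number of landmarks stored.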
Hardness of Exact Distance Queries in Sparse Graphs Through Hub Labeling
A distance labeling scheme is an assignment of bit-labels to the vertices of
an undirected, unweighted graph such that the distance between any pair of
vertices can be decoded solely from their labels. An important class of
distance labeling schemes is that of hub labelings, where a node
stores its distance to the so-called hubs , chosen so that for
any there is belonging to some shortest
path. Notice that for most existing graph classes, the best known distance labeling
constructions use a hub labeling scheme at some point, at least as a
key building block. Our interest lies in hub labelings of sparse graphs, i.e.,
those with , for which we show a lower bound of
for the average size of the hubsets.
Additionally, we show a hub-labeling construction for sparse graphs of average
size for some , where is the
so-called Ruzsa-Szemerédi function, linked to the structure of induced
matchings in dense graphs. This implies that further improving the lower bound
on hub labeling size to would require a
breakthrough in the study of lower bounds on , which have resisted
substantial improvement in the last 70 years. For general distance labeling of
sparse graphs, we show a lower bound of , where is the communication complexity of the
Sum-Index problem over . Our results suggest that the best achievable
hub-label size and distance-label size in sparse graphs may be
for some
Measuring Effectiveness of Address Schemes for AS-level Graphs
This dissertation presents measures of efficiency and locality for Internet addressing schemes.
Historically speaking, many issues faced by the Internet have been solved just in time, to make the Internet just work~\cite{justWork}. Consensus, however, has been reached that today's Internet routing and addressing system is facing serious scaling problems: multi-homing, which causes finer granularity of routing policies and finer control to realize various traffic engineering requirements; an increased demand for provider-independent prefix allocations, which injects unaggregatable prefixes into the Default Free Zone (DFZ) routing table; and an ever-increasing Internet user population and number of mobile edge devices. As a result, the DFZ routing table is again growing at an exponential rate.
Hierarchical, topology-based addressing has long been considered crucial to routing and forwarding scalability. Recently, however, a number of research efforts have been considering alternatives to this traditional approach. With the goal of informing such research, we investigated the efficiency of address assignment in the existing (IPv4) Internet. In particular, we ask the question: "how can we measure the locality of an address scheme given an input AS-level graph?"
To do so, we first define a notion of efficiency or locality based on the average number of bit-hops required to advertise all prefixes in the Internet. In order to quantify how far from "optimal" the current Internet is, we assign prefixes to ASes "from scratch" in a manner that preserves observed semantics, using three increasingly strict definitions of equivalence.
Next we propose another metric that in some sense quantifies the "efficiency" of the labeling and is independent of forwarding/routing mechanisms. We validate the effectiveness of the metric by applying it to a series of address schemes with increasing randomness given an input AS-level graph. After that we apply the metric to the current Internet address scheme across years and compare the results with those of compact routing schemes.
Beyond Highway Dimension: Small Distance Labels Using Tree Skeletons
The goal of a hub-based distance labeling scheme for a network G = (V, E) is to assign a small subset S(u) ⊆ V to each node u ∈ V, in such a way that for any pair of nodes u, v, the intersection of hub sets S(u) ∩ S(v) contains a node on the shortest uv-path. The existence of small hub sets, and consequently efficient shortest path processing algorithms, for road networks is an empirical observation. A theoretical explanation for this phenomenon was proposed by Abraham et al. (SODA 2010) through a network parameter they called highway dimension, which captures the size of a hitting set for a collection of shortest paths of length at least r intersecting a given ball of radius 2r. In this work, we revisit this explanation, introducing a more tractable (and directly comparable) parameter based solely on the structure of shortest-path spanning trees, which we call skeleton dimension. We show that skeleton dimension admits an intuitive definition for both directed and undirected graphs, provides a way of computing labels more efficiently than by using highway dimension, and leads to comparable or stronger theoretical bounds on hub set size.
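The hub set property described above can be made concrete with a small Python sketch of the query step. It assumes each label stores the distance from a node to every hub in S(u), and it uses a hand-built labeling of a star graph rather than any construction from the paper; `hub_query` and the `labels` dictionary are hypothetical names.

```python
def hub_query(labels, u, v):
    """Return d(u, v) as the minimum of d(u, h) + d(h, v) over common hubs h.

    labels[x] maps each hub in S(x) to its distance from x. The result is
    the true distance only if S(u) and S(v) share a hub on a shortest
    u-v path. Returns None when the two hub sets are disjoint.
    """
    small, large = labels[u], labels[v]
    if len(small) > len(large):
        small, large = large, small  # scan the smaller label
    best = None
    for hub, d in small.items():
        if hub in large:
            candidate = d + large[hub]
            if best is None or candidate < best:
                best = candidate
    return best

# Star graph with center "c" and leaves "x", "y", "z" (unit edge weights).
# Every shortest path between distinct leaves passes through "c", so the
# hub sets S(leaf) = {leaf, c} and S(c) = {c} satisfy the cover property.
labels = {
    "c": {"c": 0},
    "x": {"x": 0, "c": 1},
    "y": {"y": 0, "c": 1},
    "z": {"z": 0, "c": 1},
}

leaf_to_leaf = hub_query(labels, "x", "y")
leaf_to_center = hub_query(labels, "x", "c")
```

The query cost is proportional to the size of the smaller label, which is why bounds on hub set size translate directly into bounds on query time and storage.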
Engineering Algorithms for Dynamic and Time-Dependent Route Planning
Efficiently computing shortest paths is an essential building block of many mobility applications, most prominently route planning/navigation devices and applications. In this thesis, we apply the algorithm engineering methodology to design algorithms for route planning in dynamic (for example, considering real-time traffic) and time-dependent (for example, considering traffic predictions) problem settings. We build on and extend the popular Contraction Hierarchies (CH) speedup technique. With a few minutes of preprocessing, CH can optimally answer shortest path queries on continental-sized road networks with tens of millions of vertices and edges in less than a millisecond, i.e. around four orders of magnitude faster than Dijkstra’s algorithm. CH has already been extended to dynamic and time-dependent problem settings. However, these adaptations suffer from limitations. For example, the time-dependent variant of CH exhibits prohibitive memory consumption on large road networks with detailed traffic predictions.
This thesis contains the following key contributions: First, we introduce CH-Potentials, an A*-based routing framework. CH-Potentials computes optimal distance estimates for A* using CH with a lower bound weight function derived at preprocessing time. The framework can be applied to any routing problem where appropriate lower bounds can be obtained. The achieved speedups range between one and three orders of magnitude over Dijkstra’s algorithm, depending on how tight the lower bounds are. Second, we propose several improvements to Customizable Contraction Hierarchies (CCH), the CH adaptation for dynamic route planning. Our improvements yield speedups of up to an order of magnitude. Further, we augment CCH to efficiently support essential extensions such as turn costs, alternative route computation and point-of-interest queries. Third, we present the first space-efficient, fast and exact speedup technique for time-dependent routing. Compared to the previous time-dependent variant of CH, our technique requires up to 40 times less memory, needs at most a third of the preprocessing time, and achieves only marginally slower query running times. Fourth, we generalize A* and introduce time-dependent A* potentials. This allows us to design the first approach for routing with combined live and predicted traffic, which achieves interactive running times for exact queries while allowing live traffic updates in a fraction of a minute. Fifth, we study extended problem models for routing with imperfect data and routing for truck drivers and present efficient algorithms for these variants. Sixth and finally, we present various complexity results for non-FIFO time-dependent routing and the extended problem models
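The idea of guiding A* with distance estimates derived from a lower bound weight function can be sketched in Python as follows. This is only an illustration under stated assumptions: the potential is computed with a plain backward Dijkstra search on the lower-bound graph, whereas the thesis obtains it through CH, and the graphs `lower` and `live` as well as all function names are hypothetical.

```python
import heapq

INF = float("inf")

def dijkstra(adj, source):
    """Shortest-path distances from source in a weighted digraph."""
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, INF):
            continue  # stale queue entry
        for v, w in adj.get(u, []):
            if d + w < dist.get(v, INF):
                dist[v] = d + w
                heapq.heappush(heap, (d + w, v))
    return dist

def reverse(adj):
    """Return the graph with every edge direction flipped."""
    rev = {u: [] for u in adj}
    for u, edges in adj.items():
        for v, w in edges:
            rev.setdefault(v, []).append((u, w))
    return rev

def a_star(adj, source, target, potential):
    """A* on adj; potential[v] must lower-bound the distance from v to target."""
    dist = {source: 0}
    heap = [(potential.get(source, INF), 0, source)]
    while heap:
        _, g, u = heapq.heappop(heap)
        if g > dist.get(u, INF):
            continue  # stale queue entry
        if u == target:
            return g
        for v, w in adj.get(u, []):
            h = potential.get(v, INF)
            if h < INF and g + w < dist.get(v, INF):
                dist[v] = g + w
                heapq.heappush(heap, (g + w + h, g + w, v))
    return None

# Lower-bound weights (e.g. free-flow travel times) fixed at preprocessing.
lower = {"a": [("b", 1), ("c", 2)], "b": [("d", 1)], "c": [("d", 1)], "d": []}
# Current weights: never below the lower bound; edge b -> d is congested.
live = {"a": [("b", 1), ("c", 2)], "b": [("d", 5)], "c": [("d", 1)], "d": []}

# Exact distances to the target in the lower-bound graph serve as potential.
potential = dijkstra(reverse(lower), "d")
shortest = a_star(live, "a", "d", potential)
```

Because every current weight is at least its lower bound, the potential never overestimates the remaining distance, so the first time the target is settled its distance is optimal. The tighter the lower bounds, the fewer vertices A* explores, which matches the abstract's remark that speedups depend on how tight the bounds are.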
Labeled Nearest Neighbor Search and Metric Spanners via Locality Sensitive Orderings
Chan, Har-Peled, and Jones [SICOMP 2020] developed locality-sensitive
orderings (LSO) for Euclidean space. A -LSO is a collection
of orderings such that for every there is an
ordering , where all the points between and w.r.t.
are in the -neighborhood of either or . In essence, LSO
allow one to reduce problems to the 1-dimensional line. Later, Filtser and Le
[STOC 2022] developed LSO's for doubling metrics, general metric spaces, and
minor free graphs.
For Euclidean and doubling spaces, the number of orderings in the LSO is
exponential in the dimension, which made them mainly useful for the low
dimensional regime. In this paper, we develop new LSO's for Euclidean,
, and doubling spaces that allow us to trade larger stretch for a much
smaller number of orderings. We then use our new LSO's (as well as the previous
ones) to construct path reporting low hop spanners, fault tolerant spanners,
reliable spanners, and light spanners for different metric spaces.
While many nearest neighbor search (NNS) data structures were constructed for
metric spaces with implicit distance representations (where the distance
between two metric points can be computed using their names, e.g. Euclidean
space), for other spaces almost nothing is known. In this paper we initiate the
study of the labeled NNS problem, where one is allowed to artificially assign
labels (short names) to metric points. We use LSO's to construct efficient
labeled NNS data structures in this model
Doctor of Philosophy
Network emulation has become an indispensable tool for the conduct of research in networking and distributed systems. It offers more realism than simulation and more control and repeatability than experimentation on a live network. However, emulation testbeds face a number of challenges, most prominently realism and scale. Because emulation allows the creation of arbitrary networks exhibiting a wide range of conditions, there is no guarantee that emulated topologies reflect real networks; the burden of selecting parameters to create a realistic environment is on the experimenter. While there are a number of techniques for measuring the end-to-end properties of real networks, directly importing such properties into an emulation has been a challenge. Similarly, while there exist numerous models for creating realistic network topologies, the lack of addresses on these generated topologies has been a barrier to using them in emulators. Once an experimenter obtains a suitable topology, that topology must be mapped onto the physical resources of the testbed so that it can be instantiated. A number of restrictions make this an interesting problem: testbeds typically have heterogeneous hardware, scarce resources which must be conserved, and bottlenecks that must not be overused. User requests for particular types of nodes or links must also be met. In light of these constraints, the network testbed mapping problem is NP-hard. Though the complexity of the problem increases rapidly with the size of the experimenter's topology and the size of the physical network, the runtime of the mapper must not; long mapping times can hinder the usability of the testbed. This dissertation makes three contributions towards improving realism and scale in emulation testbeds. First, it meets the need for realistic network conditions by creating Flexlab, a hybrid environment that couples an emulation testbed with a live-network testbed, inheriting strengths from each.
Second, it attends to the need for realistic topologies by presenting a set of algorithms for automatically annotating generated topologies with realistic IP addresses. Third, it presents a mapper, assign, that is capable of assigning experimenters' requested topologies to testbeds' physical resources in a manner that scales well enough to handle large environments