1,318 research outputs found

    Distance-generalized Core Decomposition

    Full text link
    The kk-core of a graph is defined as the maximal subgraph in which every vertex is connected to at least kk other vertices within that subgraph. In this work we introduce a distance-based generalization of the notion of kk-core, which we refer to as the (k,h)(k,h)-core, i.e., the maximal subgraph in which every vertex has at least kk other vertices at distance ≤h\leq h within that subgraph. We study the properties of the (k,h)(k,h)-core showing that it preserves many of the nice features of the classic core decomposition (e.g., its connection with the notion of distance-generalized chromatic number) and it preserves its usefulness to speed-up or approximate distance-generalized notions of dense structures, such as hh-club. Computing the distance-generalized core decomposition over large networks is intrinsically complex. However, by exploiting clever upper and lower bounds we can partition the computation in a set of totally independent subcomputations, opening the door to top-down exploration and to multithreading, and thus achieving an efficient algorithm

    Revisiting Radius, Diameter, and all Eccentricity Computation in Graphs through Certificates

    Get PDF
    We introduce notions of certificates allowing to bound eccentricities in a graph. In particular , we revisit radius (minimum eccentricity) and diameter (maximum eccentricity) computation and explain the efficiency of practical radius and diameter algorithms by the existence of small certificates for radius and diameter plus few additional properties. We show how such computation is related to covering a graph with certain balls or complementary of balls. We introduce several new algorithmic techniques related to eccentricity computation and propose algorithms for radius, diameter and all eccentricities with theoretical guarantees with respect to certain graph parameters. This is complemented by experimental results on various real-world graphs showing that these parameters appear to be low in practice. We also obtain refined results in the case where the input graph has low doubling dimension, has low hyperbolicity, or is chordal

    Distributed Data Summarization in Well-Connected Networks

    Get PDF
    We study distributed algorithms for some fundamental problems in data summarization. Given a communication graph G of n nodes each of which may hold a value initially, we focus on computing sum_{i=1}^N g(f_i), where f_i is the number of occurrences of value i and g is some fixed function. This includes important statistics such as the number of distinct elements, frequency moments, and the empirical entropy of the data. In the CONGEST~ model, a simple adaptation from streaming lower bounds shows that it requires Omega~(D+ n) rounds, where D is the diameter of the graph, to compute some of these statistics exactly. However, these lower bounds do not hold for graphs that are well-connected. We give an algorithm that computes sum_{i=1}^{N} g(f_i) exactly in {tau_{G}} * 2^{O(sqrt{log n})} rounds where {tau_{G}} is the mixing time of G. This also has applications in computing the top k most frequent elements. We demonstrate that there is a high similarity between the GOSSIP~ model and the CONGEST~ model in well-connected graphs. In particular, we show that each round of the GOSSIP~ model can be simulated almost perfectly in O~({tau_{G}}) rounds of the CONGEST~ model. To this end, we develop a new algorithm for the GOSSIP~ model that 1 +/- epsilon approximates the p-th frequency moment F_p = sum_{i=1}^N f_i^p in O~(epsilon^{-2} n^{1-k/p}) roundsfor p >= 2, when the number of distinct elements F_0 is at most O(n^{1/(k-1)}). This result can be translated back to the CONGEST~ model with a factor O~({tau_{G}}) blow-up in the number of rounds

    Algorithms for Landmark Hub Labeling

    Get PDF
    Landmark-based routing and Hub Labeling (HL) are shortest path planning techniques, both of which rely on storing shortest path distances between selected pairs of nodes in a preprocessing phase to accelerate query answering. In Landmark-based routing, stored distances to landmark nodes are used to obtain distance lower bounds that guide A* search from node s to node t. With HL, tight upper bounds for shortest path distances between any s-t-pair can be interfered from their stored node labels, making HL an efficient distance oracle. However, for shortest path retrieval, the oracle has to be called once per edge in said path. Furthermore, HL often suffers from a large space consumption as many node pair distances have to be stored in the labels to allow for correct query answering. In this paper, we propose a novel technique, called Landmark Hub Labeling (LHL), which integrates the landmark concept into HL. We prove better worst-case path retrieval times for LHL in case it is path-consistent (a new labeling property we introduce). Moreover, we design efficient (approximation) algorithms that produce path-consistent LHL with small label size and provide parametrized upper bounds, depending on the highway dimension h or the geodesic transversal number gt of the graph. Finally, we show that the space consumption of LHL is smaller than that of (hierarchical) HL, both in theory and in experiments on real-world road networks

    High-Performance Reachability Query Processing under Index Size Restrictions

    Full text link
    In this paper, we propose a scalable and highly efficient index structure for the reachability problem over graphs. We build on the well-known node interval labeling scheme where the set of vertices reachable from a particular node is compactly encoded as a collection of node identifier ranges. We impose an explicit bound on the size of the index and flexibly assign approximate reachability ranges to nodes of the graph such that the number of index probes to answer a query is minimized. The resulting tunable index structure generates a better range labeling if the space budget is increased, thus providing a direct control over the trade off between index size and the query processing performance. By using a fast recursive querying method in conjunction with our index structure, we show that in practice, reachability queries can be answered in the order of microseconds on an off-the-shelf computer - even for the case of massive-scale real world graphs. Our claims are supported by an extensive set of experimental results using a multitude of benchmark and real-world web-scale graph datasets.Comment: 30 page
    • …
    corecore