113 research outputs found

    TopCom: Index for Shortest Distance Query in Directed Graph

    Finding the shortest distance between two vertices in a graph is an important problem due to its numerous applications in diverse domains, including geo-spatial databases, social network analysis, and information retrieval. Classical algorithms (such as Dijkstra's) solve this problem in polynomial time, but they cannot provide real-time responses for a large number of bursty queries on a large graph. Indexing-based solutions, which pre-process the graph so that a large number of distance queries can be answered (exactly or approximately) in real time, are therefore becoming increasingly popular. Existing solutions vary in index size, index building time, query time, and accuracy. In this work, we propose TopCom, a novel indexing-based solution for exactly answering distance queries. Our experiments with two existing state-of-the-art methods (IS-Label and TreeMap) show the superiority of TopCom over both in scalability and query time. In addition, TopCom's indexing exploits the DAG (directed acyclic graph) structure in the graph, which makes it significantly faster than the existing methods if the SCCs (strongly connected components) of the input graph are relatively small.

    Scaling distance labeling on small-world networks

    © 2019 Association for Computing Machinery. Distance labeling approaches are widely adopted to speed up the online performance of shortest distance queries. Constructing the distance labeling, however, can be expensive, especially on big graphs. For a major category of large graphs, small-world networks, the state-of-the-art approach is Pruned Landmark Labeling (PLL). PLL prunes distance labels based on a node order and directly constructs the pruned labels by performing breadth-first searches in that node order. The pruning technique, as well as the index construction, has a strongly sequential nature, which hinders PLL from being parallelized. This becomes an urgent issue on massive small-world networks whose index can hardly be constructed by a single thread within a reasonable time. This paper scales distance labeling on small-world networks by proposing a Parallel Shortest-distance Labeling (PSL) scheme and further reducing the index size by exploiting graph and label properties. PSL converts PLL's node-order dependency into a shortest-distance dependency, which leads to a propagation-based parallel labeling in D rounds, where D denotes the diameter of the graph. Extensive experimental results verify the efficiency of PSL on billion-scale graphs and its near-linear speedup in a multi-core environment.
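
    The sequential PLL construction described in this abstract can be sketched as follows. This is a minimal illustration for an unweighted, undirected graph, not the paper's PSL scheme; the function names `build_pruned_labels` and `query`, the adjacency-dict input, and the caller-supplied node order are all assumptions made for the example.

```python
from collections import deque


def query(labels, s, t):
    """2-hop query: best distance through any hub shared by both labels."""
    common = labels[s].keys() & labels[t].keys()
    return min((labels[s][h] + labels[t][h] for h in common),
               default=float("inf"))


def build_pruned_labels(adj, order):
    """Pruned BFS labeling (hypothetical helper, unweighted undirected graph).

    adj:   dict mapping each vertex to a list of its neighbours
    order: all vertices, most important hub first (e.g. by degree)
    """
    labels = {v: {} for v in adj}  # labels[v][hub] = dist(hub, v)
    for hub in order:
        dist = {hub: 0}
        queue = deque([hub])
        while queue:
            u = queue.popleft()
            # Prune: earlier hubs already certify a distance <= dist[u],
            # so neither u nor anything reached only through u needs a label.
            if query(labels, hub, u) <= dist[u]:
                continue
            labels[u][hub] = dist[u]
            for w in adj[u]:
                if w not in dist:
                    dist[w] = dist[u] + 1
                    queue.append(w)
    return labels
```

    Each BFS reads the labels written by all earlier BFS runs, which is the node-order dependency the abstract identifies as the obstacle to parallelization.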

    Efficiently Answering Quality Constrained Shortest Distance Queries in Large Graphs

    The shortest-path distance is a fundamental concept in graph data analytics and has been extensively studied in the literature. In many real-world applications, quality constraints are naturally associated with edges in the graph, and finding the shortest distance between vertices along only valid edges (i.e., edges that satisfy a given quality constraint) is also critical. In this work, we investigate this novel and important problem of quality-constrained shortest distance queries. We propose an efficient index structure based on 2-hop labeling approaches. Supported by a path dominance relationship incorporating both quality and length information, we demonstrate the minimal property of the new index. An efficient query processing algorithm is also developed. Extensive experimental studies over real-life datasets demonstrate the efficiency and effectiveness of our techniques.

    Top-k Route Search through Submodularity Modeling of Recurrent POI Features

    We consider a practical top-k route search problem: given a collection of points of interest (POIs) with rated features and traveling costs between POIs, a user wants to find k routes from a source to a destination, within a cost budget, that maximally match her feature preferences. One challenge is dealing with the personalized diversity requirement, where users have various trade-offs between quantity (the number of POIs with a specified feature) and variety (the coverage of specified features). Another challenge is the large scale of the POI map and the great many alternative routes to search. We model the personalized diversity requirement by the whole class of submodular functions, and present an optimal solution to the top-k route search problem through indices for retrieving relevant POIs in both feature and route spaces and various strategies for pruning the search space using user preferences and constraints. We also present promising heuristic solutions and evaluate all the solutions on real-life data. Comment: 11 pages, 7 figures, 2 table

    Pruning based Distance Sketches with Provable Guarantees on Random Graphs

    Measuring the distances between vertices on graphs is one of the most fundamental components in network analysis. Since finding shortest paths requires traversing the graph, it is challenging to obtain distance information on large graphs very quickly. In this work, we present a preprocessing algorithm that is able to create landmark based distance sketches efficiently, with strong theoretical guarantees. When evaluated on a diverse set of social and information networks, our algorithm significantly improves over existing approaches by reducing the number of landmarks stored, preprocessing time, or stretch of the estimated distances. On Erdős-Rényi graphs and random power law graphs with degree distribution exponent $2 < \beta < 3$, our algorithm outputs an exact distance data structure with space between $\Theta(n^{5/4})$ and $\Theta(n^{3/2})$ depending on the value of $\beta$, where $n$ is the number of vertices. We complement the algorithm with tight lower bounds for Erdős-Rényi graphs and the case when $\beta$ is close to two. Comment: Full version for the conference paper to appear in The Web Conference'1
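
    As a rough illustration of what a landmark-based distance sketch is, the following stores BFS distances from a few landmarks and answers a query with the triangle-inequality upper bound. It is a generic baseline, not the pruning algorithm of the paper above; the names `bfs_distances`, `build_sketch`, and `estimate_distance` and the unweighted, undirected adjacency-dict input are assumptions.

```python
from collections import deque


def bfs_distances(adj, source):
    """Hop distances from source to every reachable vertex."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist


def build_sketch(adj, landmarks):
    """Precompute one BFS distance table per landmark (hypothetical helper)."""
    return {l: bfs_distances(adj, l) for l in landmarks}


def estimate_distance(sketch, s, t):
    """Upper bound on d(s, t): min over landmarks l of d(s, l) + d(l, t)."""
    inf = float("inf")
    return min((d.get(s, inf) + d.get(t, inf) for d in sketch.values()),
               default=inf)
```

    The estimate is exact whenever some landmark lies on a shortest path between the two query vertices; otherwise it overestimates, and the ratio of the estimate to the true distance is the stretch mentioned in the abstract.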

    Efficiently answering reachability and path queries on temporal bipartite graphs

    Bipartite graphs are naturally used to model relationships between two different types of entities, such as people-location, author-paper, and customer-product. When modeling real-world applications like disease outbreaks, edges are often enriched with temporal information, leading to temporal bipartite graphs. While reachability has been extensively studied on (temporal) unipartite graphs, it remains largely unexplored on temporal bipartite graphs. To fill this research gap, we study the reachability problem on temporal bipartite graphs. Specifically, a vertex u reaches a vertex w in a temporal bipartite graph G if u and w are connected through a series of consecutive wedges with time constraints. To efficiently answer whether one vertex can reach another, we propose an index-based method that adapts the idea of 2-hop labeling. Effective optimization strategies and parallelization techniques are devised to accelerate the index construction process. To better support real-life scenarios, we further show how the index can be leveraged to efficiently answer other types of queries, e.g., single-source reachability queries and earliest-arrival path queries. Extensive experiments on 16 real-world graphs demonstrate the effectiveness and efficiency of our proposed techniques.