Improved Graph Indexing Algorithms for Label-Constrained Reachability Queries
Nowadays graph data have become absolutely ubiquitous in various applications starting from social/road networks to bio-medical data etc. Given such graph data, a reachability query asks if there exists a path from a source vertex to a target vertex in the graph. Due to its immense implications in both theory and applied domains, this query and many of its variants have been extensively studied in the literature. One such variant investigates the reachability between two vertices in an edge-labeled graph while constraining the label set simultaneously. This problem has recently been addressed by Valstar et al. [SIGMOD'17] who proposed an approach called the landmark indexing (LI) to support faster label-constrained reachability (LCR) queries. In this work, we introduce a simple, practical and space-efficient solution for answering LCR queries even faster. The experimental evaluation shows significant time and space efficiency benefits of our proposed solution over the LI approach for this problem in both real-world and synthetic graphs.
A+ Indexes: Tunable and Space-Efficient Adjacency Lists in Graph Database Management Systems
Graph database management systems (GDBMSs) are highly optimized to perform
fast traversals, i.e., joins of vertices with their neighbours, by indexing the
neighbourhoods of vertices in adjacency lists. However, existing GDBMSs have
system-specific and fixed adjacency list structures, which makes each system
efficient on only a fixed set of workloads. We describe a new tunable indexing
subsystem for GDBMSs, which we call A+ indexes, with materialized view support. The
subsystem consists of two types of indexes: (i) vertex-partitioned indexes that
partition 1-hop materialized views into adjacency lists on either the source or
destination vertex IDs; and (ii) edge-partitioned indexes that partition 2-hop
views into adjacency lists on one of the edge IDs. As in existing GDBMSs, a
system by default requires one forward and one backward vertex-partitioned
index, which we call the primary A+ index. Users can tune the primary index or
secondary indexes by adding nested partitioning and sorting criteria. Our
secondary indexes are space-efficient and use a technique we call offset lists.
Our indexing subsystem allows a wider range of applications to benefit from
GDBMSs' fast join capabilities. We demonstrate the tunability and space
efficiency of A+ indexes through extensive experiments on three workloads.
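The vertex-partitioned indexes described in this abstract can be pictured with a small Python sketch. It is only an illustration under stated assumptions, not the system's actual storage layout: the function name `build_vertex_partitioned_index`, the choice of edge label as the nested partitioning criterion, and sorting by neighbour ID are all hypothetical choices made for the example.

```python
from collections import defaultdict

def build_vertex_partitioned_index(edges, direction="forward"):
    """Partition 1-hop edges into adjacency lists keyed by a vertex ID.

    edges: iterable of (src, label, dst) triples.
    direction: "forward" partitions on the source ID,
               "backward" partitions on the destination ID.
    Assumption for this sketch: each list is further partitioned by edge
    label (a nested partitioning criterion) and sorted by neighbour ID.
    """
    index = defaultdict(lambda: defaultdict(list))
    for src, label, dst in edges:
        if direction == "forward":
            owner, neighbour = src, dst
        else:
            owner, neighbour = dst, src
        index[owner][label].append(neighbour)
    # Apply the sorting criterion inside every nested partition.
    for by_label in index.values():
        for neighbours in by_label.values():
            neighbours.sort()
    # Return plain dicts so that lookups do not create empty entries.
    return {owner: dict(by_label) for owner, by_label in index.items()}

edges = [(1, "knows", 3), (1, "knows", 2), (1, "owns", 9), (2, "knows", 3)]
forward = build_vertex_partitioned_index(edges, "forward")
backward = build_vertex_partitioned_index(edges, "backward")
```

In this sketch, `forward[1]["knows"]` lists the vertices that vertex 1 reaches over `knows` edges, and `backward[3]["knows"]` lists the vertices that reach vertex 3 over `knows` edges.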
On the Evaluation of Pattern Match Queries in Large Graph Databases
Recently, graph databases have received much attention in the research community due to their extensive applications in practice, such as social networks, biological networks and the World Wide Web, which bring forth many challenging data management problems including subgraph search, shortest-path query, reachability verification, pattern matching, and so on. Among them, graph pattern matching is to find all matches in a data graph for a given pattern graph, and is more general and flexible than the other problems mentioned above. In this thesis, we address a kind of graph matching, the so-called pattern matching with δ, by which an edge in the pattern graph is allowed to match a path of length ≤ δ in the data graph. In order to reduce the search space when exploring to find matches, we propose a novel pruning algorithm to eliminate all unqualified vertices. We also propose a strategy to speed up the distance-based join over two lists of vertices. Extensive experiments have been conducted, which show that our approach makes great improvements in running time compared to existing ones. Master of Science in Applied Computer Science
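The δ-bounded matching condition can be illustrated with a depth-limited breadth-first search. The sketch below is not the thesis's pruning algorithm or its distance-based join; it only checks the basic condition for one candidate pair. The function name `within_delta`, the dictionary-based adjacency representation, and the assumption that the data graph is directed are all hypothetical.

```python
from collections import deque

def within_delta(adjacency, u, v, delta):
    """Return True if the data graph has a path from u to v of length 1..delta.

    adjacency: dict mapping a vertex to an iterable of its out-neighbours
    (assumed representation for this sketch).
    """
    frontier = deque([(u, 0)])
    seen = {u}
    while frontier:
        node, dist = frontier.popleft()
        if dist == delta:
            # Any extension of this path would be longer than delta.
            continue
        for nxt in adjacency.get(node, ()):
            if nxt == v:
                return True
            if nxt not in seen:
                seen.add(nxt)
                frontier.append((nxt, dist + 1))
    return False

# A simple chain 1 -> 2 -> 3 -> 4 as the data graph.
chain = {1: [2], 2: [3], 3: [4]}
```

With δ = 2, a pattern edge could map to the vertex pair (1, 3) in this chain but not to (1, 4), which needs a path of length 3.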
Landmark indexing for evaluation of label-constrained reachability queries
Consider a directed edge-labeled graph, such as a social network or a citation network. A fundamental query on such data is to determine if there is a path in the graph from a given source vertex to a given target vertex, using only edges with labels in a restricted subset of the edge labels in the graph. Such label-constrained reachability (LCR) queries play an important role in graph analytics, for example, as a core fragment of the so-called regular path queries which are supported in practical graph query languages such as the W3C's SPARQL 1.1, Neo4j's Cypher, and Oracle's PGQL. Current solutions for LCR evaluation, however, do not scale to large graphs which are increasingly common in a broad range of application domains. In this paper we present the first practical solution for efficient LCR evaluation, leveraging landmark-based indexes for large graphs. We show through extensive experiments that our indexes are significantly smaller than state-of-the-art LCR indexing techniques, while supporting up to orders of magnitude faster query evaluation times. Our complete C++ codebase is available as open source for further research.
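To make the LCR query definition concrete, here is a minimal index-free baseline: a breadth-first search that only follows edges whose label is in the allowed set. This is not the landmark index from the paper, only a plain traversal that an index of that kind would aim to outperform. The function name `lcr_reachable`, the triple-based edge list, and the example labels are assumptions made for illustration.

```python
from collections import deque

def lcr_reachable(edges, source, target, allowed_labels):
    """Return True if target is reachable from source using only edges
    whose label is in allowed_labels.

    edges: iterable of directed (src, label, dst) triples.
    """
    allowed = set(allowed_labels)
    # Keep only the edges that satisfy the label constraint.
    adjacency = {}
    for src, label, dst in edges:
        if label in allowed:
            adjacency.setdefault(src, []).append(dst)
    seen = {source}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            return True
        for nxt in adjacency.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False

# Hypothetical edge-labeled graph for the example.
example_edges = [("a", "follows", "b"), ("b", "likes", "c"), ("a", "likes", "d")]
```

In this example, `c` is reachable from `a` when both labels are allowed, but not when the label set is restricted to `follows` alone, because the second hop uses a `likes` edge.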