Processing SPARQL Queries Over Distributed RDF Graphs
We propose techniques for processing SPARQL queries over a large RDF graph in
a distributed environment. We adopt a "partial evaluation and assembly"
framework. Answering a SPARQL query Q is equivalent to finding subgraph matches
of the query graph Q over RDF graph G. Based on properties of subgraph matching
over a distributed graph, we introduce local partial match as partial answers
in each fragment of RDF graph G. For assembly, we propose two methods:
centralized and distributed assembly. We analyze our algorithms both
theoretically and experimentally. Extensive experiments over both real and
benchmark RDF repositories of billions of triples confirm that our method is
superior to the state-of-the-art methods in both the system's performance and
scalability. Comment: 30 pages
Distributed Processing of Generalized Graph-Pattern Queries in SPARQL 1.1
We propose an efficient and scalable architecture for processing generalized
graph-pattern queries as they are specified by the current W3C recommendation
of the SPARQL 1.1 "Query Language" component. Specifically, the class of
queries we consider consists of sets of SPARQL triple patterns with labeled
property paths. From a relational perspective, this class resolves to
conjunctive queries of relational joins with additional graph-reachability
predicates. For the scalable, i.e., distributed, processing of such
queries over very large RDF collections, we develop a suitable partitioning and
indexing scheme, which allows us to shard the RDF triples over an entire
cluster of compute nodes and to process an incoming SPARQL query over all of
the relevant graph partitions (and thus compute nodes) in parallel. Unlike most
prior works in this field, we specifically aim at the unified optimization and
distributed processing of queries consisting of both relational joins and
graph-reachability predicates. All communication among the compute nodes is
established via a proprietary, asynchronous communication protocol based on the
Message Passing Interface.
An introduction to Graph Data Management
A graph database is a database where the data structures for the schema
and/or instances are modeled as a (labeled) (directed) graph or generalizations
of it, and where querying is expressed by graph-oriented operations and type
constructors. In this article we present the basic notions of graph databases,
give a historical overview of their main developments, and study the main current
systems that implement them.