4,800 research outputs found
Distributed Processing of Generalized Graph-Pattern Queries in SPARQL 1.1
We propose an efficient and scalable architecture for processing generalized
graph-pattern queries as they are specified by the current W3C recommendation
of the SPARQL 1.1 "Query Language" component. Specifically, the class of
queries we consider consists of sets of SPARQL triple patterns with labeled
property paths. From a relational perspective, this class resolves to
conjunctive queries of relational joins with additional graph-reachability
predicates. For the scalable, i.e., distributed, processing of this kind of
queries over very large RDF collections, we develop a suitable partitioning and
indexing scheme, which allows us to shard the RDF triples over an entire
cluster of compute nodes and to process an incoming SPARQL query over all of
the relevant graph partitions (and thus compute nodes) in parallel. Unlike most
prior works in this field, we specifically aim at the unified optimization and
distributed processing of queries consisting of both relational joins and
graph-reachability predicates. All communication among the compute nodes is
established via a proprietary, asynchronous communication protocol based on the
Message Passing Interface
An introduction to Graph Data Management
A graph database is a database where the data structures for the schema
and/or instances are modeled as a (labeled)(directed) graph or generalizations
of it, and where querying is expressed by graph-oriented operations and type
constructors. In this article we present the basic notions of graph databases,
give an historical overview of its main development, and study the main current
systems that implement them
Towards Efficient Path Query on Social Network with Hybrid RDF Management
The scalability and exibility of Resource Description Framework(RDF) model
make it ideally suited for representing online social networks(OSN). One basic
operation in OSN is to find chains of relations,such as k-Hop friends. Property
path query in SPARQL can express this type of operation, but its implementation
suffers from performance problem considering the ever growing data size and
complexity of OSN.In this paper, we present a main memory/disk based hybrid RDF
data management framework for efficient property path query. In this hybrid
framework, we realize an efficient in-memory algebra operator for property path
query using graph traversal, and estimate the cost of this operator to
cooperate with existing cost-based optimization. Experiments on benchmark and
real dataset demonstrated that our approach can achieve a good tradeoff between
data load expense and online query performance
Reasoning & Querying – State of the Art
Various query languages for Web and Semantic Web data, both for practical use and as an area of research in the scientific community, have emerged in recent years. At the same time, the broad adoption of the internet where keyword search is used in many applications, e.g. search engines, has familiarized casual users with using keyword queries to retrieve information on the internet. Unlike this easy-to-use querying, traditional query languages require knowledge of the language itself as well as of the data to be queried. Keyword-based query languages for XML and RDF bridge the gap between the two, aiming at enabling simple querying of semi-structured data, which is relevant e.g. in the context of the emerging Semantic Web. This article presents an overview of the field of keyword querying for XML and RDF
Web and Semantic Web Query Languages
A number of techniques have been developed to facilitate
powerful data retrieval on the Web and Semantic Web. Three categories
of Web query languages can be distinguished, according to the format
of the data they can retrieve: XML, RDF and Topic Maps. This article
introduces the spectrum of languages falling into these categories
and summarises their salient aspects. The languages are introduced using
common sample data and query types. Key aspects of the query
languages considered are stressed in a conclusion
A Selectivity based approach to Continuous Pattern Detection in Streaming Graphs
Cyber security is one of the most significant technical challenges in current
times. Detecting adversarial activities, prevention of theft of intellectual
properties and customer data is a high priority for corporations and government
agencies around the world. Cyber defenders need to analyze massive-scale,
high-resolution network flows to identify, categorize, and mitigate attacks
involving networks spanning institutional and national boundaries. Many of the
cyber attacks can be described as subgraph patterns, with prominent examples
being insider infiltrations (path queries), denial of service (parallel paths)
and malicious spreads (tree queries). This motivates us to explore subgraph
matching on streaming graphs in a continuous setting. The novelty of our work
lies in using the subgraph distributional statistics collected from the
streaming graph to determine the query processing strategy. We introduce a
"Lazy Search" algorithm where the search strategy is decided on a
vertex-to-vertex basis depending on the likelihood of a match in the vertex
neighborhood. We also propose a metric named "Relative Selectivity" that is
used to select between different query processing strategies. Our experiments
performed on real online news, network traffic stream and a synthetic social
network benchmark demonstrate 10-100x speedups over selectivity agnostic
approaches.Comment: in 18th International Conference on Extending Database Technology
(EDBT) (2015
- …