Goal-Driven Query Answering for Existential Rules with Equality
Inspired by the magic sets for Datalog, we present a novel goal-driven
approach for answering queries over terminating existential rules with equality
(aka TGDs and EGDs). Our technique improves the performance of query answering
by pruning the consequences that are not relevant for the query. This is
challenging in our setting because equalities can potentially affect all
predicates in a dataset. We address this problem by combining the existing
singularization technique with two new ingredients: an algorithm for
identifying the rules relevant to a query and a new magic sets algorithm. We
show empirically that our technique can significantly improve the performance
of query answering, and that it can mean the difference between answering a
query in a few seconds or not being able to process the query at all.
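
As a rough illustration of the "rules relevant to a query" idea, the sketch below walks backwards from the query predicate through rule heads and bodies and keeps only the rules it reaches. It is a simplified, plain-Datalog analogue and not the algorithm from the paper: it ignores equality and existential quantification, which are exactly what make the paper's setting hard. The rule encoding, the predicate names, and the function `relevant_rules` are assumptions made for this example.

```python
from collections import deque

# Hypothetical rule set: each rule is (head_predicate, [body_predicates]).
RULES = [
    ("ancestor", ["parent"]),
    ("ancestor", ["parent", "ancestor"]),
    ("sibling", ["parent"]),
    ("colleague", ["worksAt"]),
]


def relevant_rules(rules, query_predicate):
    """Keep only rules whose head predicate the query depends on."""
    needed = {query_predicate}
    queue = deque([query_predicate])
    while queue:
        pred = queue.popleft()
        for head, body in rules:
            if head != pred:
                continue
            # Every body predicate of a needed rule becomes needed too.
            for body_pred in body:
                if body_pred not in needed:
                    needed.add(body_pred)
                    queue.append(body_pred)
    return [rule for rule in rules if rule[0] in needed]


if __name__ == "__main__":
    for head, body in relevant_rules(RULES, "ancestor"):
        print(head, "<-", ", ".join(body))
```

For a query over `ancestor`, the `sibling` and `colleague` rules are dropped, so no consequences are derived for them. A magic sets transformation would go further and also restrict which facts are derived for the rules that remain.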
Combined FO rewritability for conjunctive query answering in DL-Lite
Standard description logic (DL) reasoning services such as satisfiability and subsumption mainly aim to support TBox design. When the design stage is over and the TBox is used in an actual application, it is usually combined with instance data stored in an ABox, and therefore query answering becomes the most importan
On the Evaluation of RDF Distribution Algorithms Implemented over Apache Spark
Querying very large RDF data sets in an efficient manner requires a
sophisticated distribution strategy. Several innovative solutions have recently
been proposed for optimizing data distribution with predefined query workloads.
This paper presents an in-depth analysis and experimental comparison of five
representative and complementary distribution approaches. For achieving fair
experimental results, we are using Apache Spark as a common parallel computing
framework by rewriting the concerned algorithms using the Spark API. Spark
provides guarantees in terms of fault tolerance, high availability and
scalability which are essential in such systems. Our different implementations
aim to highlight the fundamental implementation-independent characteristics of
each approach in terms of data preparation, load balancing, data replication,
and, to some extent, query answering cost and performance. The presented
measures are obtained by testing each system on one synthetic and one
real-world data set over query workloads with differing characteristics and
different partitioning constraints.
Comment: 16 pages, 3 figures