Reasoning over Ontologies with Hidden Content: The Import-by-Query Approach
There is currently a growing interest in techniques for hiding parts of the
signature of an ontology Kh that is being reused by another ontology Kv.
Towards this goal, in this paper we propose the import-by-query framework,
which makes the content of Kh accessible through a limited query interface. If
Kv reuses the symbols from Kh in a certain restricted way, one can reason over
Kv ∪ Kh by accessing only Kv and the query interface. We map out the landscape
of the import-by-query problem. In particular, we outline the limitations of
our framework and prove that certain restrictions on the expressivity of Kh and
the way in which Kv reuses symbols from Kh are strictly necessary to enable
reasoning in our setting. We also identify cases in which reasoning is possible
and we present suitable import-by-query reasoning algorithms.
Goal-Driven Query Answering for Existential Rules with Equality
Inspired by the magic sets for Datalog, we present a novel goal-driven
approach for answering queries over terminating existential rules with equality
(aka TGDs and EGDs). Our technique improves the performance of query answering
by pruning the consequences that are not relevant for the query. This is
challenging in our setting because equalities can potentially affect all
predicates in a dataset. We address this problem by combining the existing
singularization technique with two new ingredients: an algorithm for
identifying the rules relevant to a query and a new magic sets algorithm. We
show empirically that our technique can significantly improve the performance
of query answering, and that it can mean the difference between answering a
query in a few seconds or not being able to process the query at all.
Hypertableau Reasoning for Description Logics
We present a novel reasoning calculus for the description logic SHOIQ^+, a
knowledge representation formalism with applications in areas such as the
Semantic Web. Unnecessary nondeterminism and the construction of large models
are two primary sources of inefficiency in the tableau-based reasoning calculi
used in state-of-the-art reasoners. In order to reduce nondeterminism, we base
our calculus on hypertableau and hyperresolution calculi, which we extend with
a blocking condition to ensure termination. In order to reduce the size of the
constructed models, we introduce anywhere pairwise blocking. We also present an
improved nominal introduction rule that ensures termination in the presence of
nominals, inverse roles, and number restrictions, a combination of DL
constructs that has proven notoriously difficult to handle. Our implementation
shows significant performance improvements over state-of-the-art reasoners on
several well-known ontologies.
Combining Rewriting and Incremental Materialisation Maintenance for Datalog Programs with Equality
Materialisation precomputes all consequences of a set of facts and a datalog
program so that queries can be evaluated directly (i.e., independently from the
program). Rewriting optimises materialisation for datalog programs with
equality by replacing all equal constants with a single representative; and
incremental maintenance algorithms can efficiently update a materialisation for
small changes in the input facts. Both techniques are critical to practical
applicability of datalog systems; however, we are unaware of an approach that
combines rewriting and incremental maintenance. In this paper we present the
first such combination, and we show empirically that it can speed up updates by
several orders of magnitude compared to using either rewriting or incremental
maintenance in isolation.
Comment: All proofs contained in the appendix. 7 pages + 4 pages appendix. 7
algorithms and one table with evaluation results.
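The rewriting idea mentioned in this abstract (replacing all equal constants with a single representative) can be illustrated with a minimal Python sketch. This is not the paper's algorithm: the union-find helpers, the tuple encoding of facts, and the choice of the lexicographically smallest constant as representative are all assumptions made for illustration, and incremental maintenance is not covered.

```python
def find(parent, x):
    """Return the representative of x, compressing paths along the way."""
    parent.setdefault(x, x)
    while parent[x] != x:
        parent[x] = parent[parent[x]]  # path halving
        x = parent[x]
    return x


def union(parent, a, b):
    """Merge the equivalence classes of a and b."""
    ra, rb = find(parent, a), find(parent, b)
    if ra != rb:
        # Assumption: the smaller constant becomes the representative.
        lo, hi = sorted((ra, rb))
        parent[hi] = lo


def rewrite_facts(facts, equalities):
    """Replace every constant in `facts` by its class representative.

    `facts` is an iterable of tuples (predicate, arg1, ..., argN);
    `equalities` is an iterable of pairs of constants known to be equal.
    """
    parent = {}
    for a, b in equalities:
        union(parent, a, b)
    return {
        (pred,) + tuple(find(parent, c) for c in args)
        for pred, *args in facts
    }
```

For instance, given the facts `("knows", "alice", "bob")` and `("knows", "al", "bob")` together with the equality `("alice", "al")`, both facts collapse into one fact over the shared representative, so the rewritten set is smaller than the original.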
Accurate sampling-based cardinality estimation for complex graph queries
Accurately estimating the cardinality (i.e., the number of answers) of complex queries plays a central role in
database systems. This problem is particularly difficult in graph databases, where queries often involve a large
number of joins and self-joins. Recently, Park et al. [54] surveyed seven state-of-the-art cardinality estimation
approaches for graph queries. The results of their extensive empirical evaluation show that a sampling method
based on the WanderJoin online aggregation algorithm [46] consistently offers superior accuracy.
We extended the framework by Park et al. [54] with three additional datasets and repeated their experiments.
Our results showed that WanderJoin is indeed very accurate, but it can often take a large number of samples
and thus be very slow. Moreover, when queries are complex and data distributions are skewed, it often fails
to find valid samples and estimates the cardinality as zero. Finally, complex graph queries often go beyond
simple graph matching and involve arbitrary nesting of relational operators such as disjunction, difference,
and duplicate elimination. None of the methods considered by Park et al. [54] is applicable to such queries.
In this paper we present a novel approach for estimating the cardinality of complex graph queries. Our
approach is inspired by WanderJoin, but, unlike all approaches known to us, it can process complex queries with
arbitrary operator nesting. Our estimator is strongly consistent, meaning that the average of repeated estimates
converges with probability one to the actual cardinality. We present optimisations of the basic algorithm
that aim to reduce the chance of producing zero estimates and improve accuracy. We show empirically that
our approach is both accurate and quick on complex queries and large datasets. Finally, we discuss how to
integrate our approach into a simple dynamic programming query planner, and we confirm empirically that
our planner produces high-quality plans that can significantly reduce end-to-end query evaluation times.
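To make the sampling idea concrete, here is a minimal Python sketch of a random-walk cardinality estimator in the spirit of WanderJoin, restricted to one simple query: counting two-hop paths in a directed graph. It is not the estimator proposed in this paper and does not handle nested operators; the function names and the edge-list encoding are assumptions for illustration. Each walk picks a first edge uniformly, then checks how many edges continue it. A successful walk contributes the inverse of its sampling probability, and a failed walk contributes zero.

```python
import random
from collections import defaultdict


def count_two_hop_paths(edges):
    """Exact number of edge pairs (a, b), (b, c), for comparison."""
    out_degree = defaultdict(int)
    for s, _ in edges:
        out_degree[s] += 1
    return sum(out_degree.get(t, 0) for _, t in edges)


def estimate_two_hop_paths(edges, num_samples, rng):
    """Average of inverse-probability weights over random walks."""
    successors = defaultdict(list)
    for s, t in edges:
        successors[s].append(t)
    total = 0.0
    for _ in range(num_samples):
        _, middle = rng.choice(edges)           # probability 1 / |E|
        candidates = successors.get(middle, [])
        if not candidates:
            continue                            # failed walk: estimate 0
        rng.choice(candidates)                  # probability 1 / out-degree
        total += len(edges) * len(candidates)   # inverse walk probability
    return total / num_samples
```

On a skewed graph such as `[(1, 2), (2, 3), (2, 4), (5, 6)]`, three of the four possible first edges lead to a dead end, so most individual walks return zero. This mirrors the zero-estimate problem the abstract describes for complex queries over skewed data.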
Modular Materialisation of Datalog Programs
The seminaïve algorithm can materialise all consequences of arbitrary
datalog rules, and it also forms the basis for incremental algorithms that
update a materialisation as the input facts change. Certain (combinations of)
rules, however, can be handled much more efficiently using custom algorithms.
To integrate such algorithms into a general reasoning approach that can handle
arbitrary rules, we propose a modular framework for materialisation computation
and its maintenance. We split a datalog program into modules that can be
handled using specialised algorithms, and handle the remaining rules using the
seminaïve algorithm. We also present two algorithms for computing the
transitive and the symmetric-transitive closure of a relation that can be used
within our framework. Finally, we show empirically that our framework can
handle arbitrary datalog programs while outperforming existing approaches,
often by orders of magnitude.
Comment: Accepted at AAAI 201
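As a point of reference for the baseline named in this abstract, the following Python sketch applies textbook seminaïve evaluation to the two transitive-closure rules `path(x, y) :- edge(x, y)` and `path(x, z) :- path(x, y), edge(y, z)`. It is not one of the specialised closure algorithms the paper presents; the function name and the set-of-pairs encoding are assumptions for illustration. The key point is that each round joins only the facts derived in the previous round (`delta`) with the edges, rather than rejoining the whole closure.

```python
def transitive_closure_seminaive(edges):
    """Materialise path/2 from edge/2 using seminaive evaluation."""
    successors = {}
    for s, t in edges:
        successors.setdefault(s, set()).add(t)

    closure = set(edges)   # all path facts derived so far
    delta = set(edges)     # facts that were new in the previous round
    while delta:
        new_facts = set()
        for x, y in delta:
            for z in successors.get(y, ()):
                if (x, z) not in closure:
                    new_facts.add((x, z))
        closure |= new_facts
        delta = new_facts  # only fresh facts are joined next round
    return closure
```

The loop stops as soon as a round derives nothing new, so it terminates on cyclic inputs as well.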
Stratified Negation in Limit Datalog Programs
There has recently been an increasing interest in declarative data analysis,
where analytic tasks are specified using a logical language, and their
implementation and optimisation are delegated to a general-purpose query
engine. Existing declarative languages for data analysis can be formalised as
variants of logic programming equipped with arithmetic function symbols and/or
aggregation, and are typically undecidable. In prior work, the language of
limit programs was proposed, which is sufficiently powerful to
capture many analysis tasks and has a decidable entailment problem. Rules in this
language, however, do not allow for negation. In this paper, we study an
extension of limit programs with stratified negation-as-failure. We show that
the additional expressive power makes reasoning computationally more demanding,
and provide tight data complexity bounds. We also identify a fragment with
tractable data complexity and sufficient expressivity to capture many relevant
tasks.
Comment: 14 pages; full version of a paper accepted at IJCAI-1
Stream Reasoning in Temporal Datalog
In recent years, there has been an increasing interest in extending
traditional stream processing engines with logical, rule-based, reasoning
capabilities. This poses significant theoretical and practical challenges since
rules can derive new information and propagate it both towards past and future
time points; as a result, streamed query answers can depend on data that has
not yet been received, as well as on data that arrived far in the past. Stream
reasoning algorithms, however, must be able to stream out query answers as soon
as possible, and can only keep a limited number of previous input facts in
memory. In this paper, we propose novel reasoning problems to deal with these
challenges, and study their computational properties on Datalog extended with a
temporal sort and the successor function (a core rule-based language for stream
reasoning applications).