633 research outputs found
Magic Sets for Disjunctive Datalog Programs
In this paper, a new technique for the optimization of (partially) bound
queries over disjunctive Datalog programs with stratified negation is
presented. The technique exploits the propagation of query bindings and extends
the Magic Set (MS) optimization technique.
An important feature of disjunctive Datalog is nonmonotonicity, which calls
for nondeterministic implementations, such as backtracking search. A
distinguishing characteristic of the new method is that the optimization can be
exploited also during the nondeterministic phase. In particular, after some
assumptions have been made during the computation, parts of the program may
become irrelevant to a query under these assumptions. This allows for dynamic
pruning of the search space. In contrast, the effect of the previously defined
MS methods for disjunctive Datalog is limited to the deterministic portion of
the process. In this way, the potential performance gain by using the proposed
method can be exponential, as could be observed empirically.
The correctness of MS is established thanks to a strong relationship between
MS and unfounded sets that has not been studied in the literature before. This
knowledge allows for extending the method also to programs with stratified
negation in a natural way.
The proposed method has been implemented in DLV and various experiments have
been conducted. Experimental results on synthetic data confirm the utility of
MS for disjunctive Datalog, and they highlight the computational gain that may
be obtained by the new method w.r.t. the previously proposed MS methods for
disjunctive Datalog programs. Further experiments on real-world data show the
benefits of MS within an application scenario that has received considerable
attention in recent years, the problem of answering user queries over possibly
inconsistent databases originating from integration of autonomous sources of
information.Comment: 67 pages, 19 figures, preprint submitted to Artificial Intelligenc
The Vadalog System: Datalog-based Reasoning for Knowledge Graphs
Over the past years, there has been a resurgence of Datalog-based systems in
the database community as well as in industry. In this context, it has been
recognized that to handle the complex knowl\-edge-based scenarios encountered
today, such as reasoning over large knowledge graphs, Datalog has to be
extended with features such as existential quantification. Yet, Datalog-based
reasoning in the presence of existential quantification is in general
undecidable. Many efforts have been made to define decidable fragments. Warded
Datalog+/- is a very promising one, as it captures PTIME complexity while
allowing ontological reasoning. Yet so far, no implementation of Warded
Datalog+/- was available. In this paper we present the Vadalog system, a
Datalog-based system for performing complex logic reasoning tasks, such as
those required in advanced knowledge graphs. The Vadalog system is Oxford's
contribution to the VADA research programme, a joint effort of the universities
of Oxford, Manchester and Edinburgh and around 20 industrial partners. As the
main contribution of this paper, we illustrate the first implementation of
Warded Datalog+/-, a high-performance Datalog+/- system utilizing an aggressive
termination control strategy. We also provide a comprehensive experimental
evaluation.Comment: Extended version of VLDB paper
<https://doi.org/10.14778/3213880.3213888
Introducing Dynamic Behavior in Amalgamated Knowledge Bases
The problem of integrating knowledge from multiple and heterogeneous sources
is a fundamental issue in current information systems. In order to cope with
this problem, the concept of mediator has been introduced as a software
component providing intermediate services, linking data resources and
application programs, and making transparent the heterogeneity of the
underlying systems. In designing a mediator architecture, we believe that an
important aspect is the definition of a formal framework by which one is able
to model integration according to a declarative style. To this purpose, the use
of a logical approach seems very promising. Another important aspect is the
ability to model both static integration aspects, concerning query execution,
and dynamic ones, concerning data updates and their propagation among the
various data sources. Unfortunately, as far as we know, no formal proposals for
logically modeling mediator architectures both from a static and dynamic point
of view have already been developed. In this paper, we extend the framework for
amalgamated knowledge bases, presented by Subrahmanian, to deal with dynamic
aspects. The language we propose is based on the Active U-Datalog language, and
extends it with annotated logic and amalgamation concepts. We model the sources
of information and the mediator (also called supervisor) as Active U-Datalog
deductive databases, thus modeling queries, transactions, and active rules,
interpreted according to the PARK semantics. By using active rules, the system
can efficiently perform update propagation among different databases. The
result is a logical environment, integrating active and deductive rules, to
perform queries and update propagation in an heterogeneous mediated framework.Comment: Other Keywords: Deductive databases; Heterogeneous databases; Active
rules; Update
Ontology-based data access with databases: a short course
Ontology-based data access (OBDA) is regarded as a key ingredient of the new generation of information systems. In the OBDA paradigm, an ontology defines a high-level global schema of (already existing) data sources and provides a vocabulary for user queries. An OBDA system rewrites such queries and ontologies into the vocabulary of the data sources and then delegates the actual query evaluation to a suitable query answering system such as a relational database management system or a datalog engine. In this chapter, we mainly focus on OBDA with the ontology language OWL 2QL, one of the three profiles of the W3C standard Web Ontology Language OWL 2, and relational databases, although other possible languages will also be discussed. We consider different types of conjunctive query rewriting and their succinctness, different architectures of OBDA systems, and give an overview of the OBDA system Ontop
Four Lessons in Versatility or How Query Languages Adapt to the Web
Exposing not only human-centered information, but machine-processable data on the Web is one of the commonalities of recent Web trends. It has enabled a new kind of applications and businesses where the data is used in ways not foreseen by the data providers. Yet this exposition has fractured the Web into islands of data, each in different Web formats: Some providers choose XML, others RDF, again others JSON or OWL, for their data, even in similar domains. This fracturing stifles innovation as application builders have to cope not only with one Web stack (e.g., XML technology) but with several ones, each of considerable complexity. With Xcerpt we have developed a rule- and pattern based query language that aims to give shield application builders from much of this complexity: In a single query language XML and RDF data can be accessed, processed, combined, and re-published. Though the need for combined access to XML and RDF data has been recognized in previous work (including the W3C’s GRDDL), our approach differs in four main aspects: (1) We provide a single language (rather than two separate or embedded languages), thus minimizing the conceptual overhead of dealing with disparate data formats. (2) Both the declarative (logic-based) and the operational semantics are unified in that they apply for querying XML and RDF in the same way. (3) We show that the resulting query language can be implemented reusing traditional database technology, if desirable. Nevertheless, we also give a unified evaluation approach based on interval labelings of graphs that is at least as fast as existing approaches for tree-shaped XML data, yet provides linear time and space querying also for many RDF graphs. We believe that Web query languages are the right tool for declarative data access in Web applications and that Xcerpt is a significant step towards a more convenient, yet highly efficient data access in a “Web of Data”
- …