Structurally Tractable Uncertain Data
Many data management applications must deal with data which is uncertain,
incomplete, or noisy. However, on existing uncertain data representations, we
cannot tractably perform the important query evaluation tasks of determining
query possibility, certainty, or probability: these problems are hard on
arbitrary uncertain input instances. We thus ask whether we could restrict the
structure of uncertain data so as to guarantee the tractability of exact query
evaluation. We present our tractability results for tree and tree-like
uncertain data, and a vision for probabilistic rule reasoning. We also study
uncertainty about order, proposing a suitable representation, and study
uncertain data conditioned by additional observations.
Comment: 11 pages, 1 figure, 1 table. To appear in SIGMOD/PODS PhD Symposium
201
On abstraction refinement for program analyses in Datalog
A central task for a program analysis concerns how to efficiently find a program abstraction that keeps only information relevant for proving properties of interest. We present a new approach for finding such abstractions for program analyses written in Datalog. Our approach is based on counterexample-guided abstraction refinement: when a Datalog analysis run fails using an abstraction, it seeks to generalize the cause of the failure to other abstractions, and pick a new abstraction that avoids a similar failure. Our solution uses a boolean satisfiability formulation that is general, complete, and optimal: it is independent of the Datalog solver, it generalizes the failure of an abstraction to as many other abstractions as possible, and it identifies the cheapest refined abstraction to try next. We show the performance of our approach on a pointer analysis and a typestate analysis, on eight real-world Java benchmark programs.
Automatic generation of simplified weakest preconditions for integrity constraint verification
Given a constraint $c$ assumed to hold on a database $B$ and an update $u$ to
be performed on $B$, we address the following question: will $c$ still hold
after $u$ is performed? When $B$ is a relational database, we define a
confluent terminating rewriting system which, starting from $c$ and $u$,
automatically derives a simplified weakest precondition $wp(c,u)$ such that,
whenever $B$ satisfies $wp(c,u)$, then the updated database $u(B)$ will satisfy
$c$, and moreover $wp(c,u)$ is simplified in the sense that its computation
depends only upon the instances of $c$ that may be modified by the update. We
then extend the definition of a simplified $wp(c,u)$ to the case of deductive
databases; we prove it using fixpoint induction.
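The idea of a simplified precondition can be illustrated with a small Python sketch. This is not the paper's rewriting system: the schema, the inclusion constraint, and the names `holds`, `apply_insert`, and `simplified_wp` are all assumptions made for illustration. The constraint says every department referenced in `emp` exists in `dept`, and the update inserts a single `emp` tuple.

```python
# Hypothetical schema: db["emp"] is a set of (name, dept) pairs,
# db["dept"] is a set of department names.
# Constraint: every department referenced in emp exists in dept.

def holds(db):
    """Check the full constraint against the whole database."""
    return all(d in db["dept"] for (_, d) in db["emp"])

def apply_insert(db, new_emp):
    """The update: insert one tuple into emp, returning a new database."""
    return {"emp": db["emp"] | {new_emp}, "dept": set(db["dept"])}

def simplified_wp(db, new_emp):
    """Simplified precondition for the insertion.

    Because the constraint is assumed to hold before the update, only the
    instance of the constraint that involves the inserted tuple is checked.
    """
    _, d = new_emp
    return d in db["dept"]

db = {"emp": {("ann", "sales")}, "dept": {"sales", "hr"}}
```

Under the assumption that the constraint holds on `db`, `simplified_wp(db, t)` agrees with `holds(apply_insert(db, t))` for any inserted tuple `t`, while inspecting only the new tuple rather than all of `emp`.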
Monadic Datalog Containment on Trees
We show that the query containment problem for monadic datalog on finite
unranked labeled trees can be solved in 2-fold exponential time when (a)
considering unordered trees using the axes child and descendant, and when (b)
considering ordered trees using the axes firstchild, nextsibling, child, and
descendant. When omitting the descendant-axis, we obtain that in both cases the
problem is EXPTIME-complete.
Comment: This article is the full version of an article published in the
proceedings of the 8th Alberto Mendelzon Workshop (AMW 2014)
Four Lessons in Versatility or How Query Languages Adapt to the Web
Exposing not only human-centered information, but machine-processable data on the Web is one of the commonalities of recent Web trends. It has enabled new kinds of applications and businesses where the data is used in ways not foreseen by the data providers. Yet this exposition has fractured the Web into islands of data, each in different Web formats: Some providers choose XML, others RDF, again others JSON or OWL, for their data, even in similar domains. This fracturing stifles innovation as application builders have to cope not only with one Web stack (e.g., XML technology) but with several, each of considerable complexity. With Xcerpt we have developed a rule- and pattern-based query language that aims to shield application builders from much of this complexity: In a single query language XML and RDF data can be accessed, processed, combined, and re-published. Though the need for combined access to XML and RDF data has been recognized in previous work (including the W3C’s GRDDL), our approach differs in four main aspects: (1) We provide a single language (rather than two separate or embedded languages), thus minimizing the conceptual overhead of dealing with disparate data formats. (2) Both the declarative (logic-based) and the operational semantics are unified in that they apply for querying XML and RDF in the same way. (3) We show that the resulting query language can be implemented reusing traditional database technology, if desirable. Nevertheless, we also give a unified evaluation approach based on interval labelings of graphs that is at least as fast as existing approaches for tree-shaped XML data, yet provides linear time and space querying also for many RDF graphs. We believe that Web query languages are the right tool for declarative data access in Web applications and that Xcerpt is a significant step towards more convenient, yet highly efficient, data access in a “Web of Data”.
Inductive Logic Programming in Databases: from Datalog to DL+log
In this paper we address an issue that has been brought to the attention of
the database community with the advent of the Semantic Web, i.e. the issue of
how ontologies (and semantics conveyed by them) can help solve typical
database problems, through a better understanding of KR aspects related to
databases. In particular, we investigate this issue from the ILP perspective by
considering two database problems, (i) the definition of views and (ii) the
definition of constraints, for a database whose schema is represented also by
means of an ontology. Both can be reformulated as ILP problems and can benefit
from the expressive and deductive power of the KR framework DL+log. We
illustrate the application scenarios by means of examples. Keywords: Inductive
Logic Programming, Relational Databases, Ontologies, Description Logics, Hybrid
Knowledge Representation and Reasoning Systems. Note: To appear in Theory and
Practice of Logic Programming (TPLP).
Comment: 30 pages, 3 figures, 2 tables
Magic Sets for Disjunctive Datalog Programs
In this paper, a new technique for the optimization of (partially) bound
queries over disjunctive Datalog programs with stratified negation is
presented. The technique exploits the propagation of query bindings and extends
the Magic Set (MS) optimization technique.
An important feature of disjunctive Datalog is nonmonotonicity, which calls
for nondeterministic implementations, such as backtracking search. A
distinguishing characteristic of the new method is that the optimization can be
exploited also during the nondeterministic phase. In particular, after some
assumptions have been made during the computation, parts of the program may
become irrelevant to a query under these assumptions. This allows for dynamic
pruning of the search space. In contrast, the effect of the previously defined
MS methods for disjunctive Datalog is limited to the deterministic portion of
the process. In this way, the potential performance gain by using the proposed
method can be exponential, as could be observed empirically.
The correctness of MS is established thanks to a strong relationship between
MS and unfounded sets that has not been studied in the literature before. This
knowledge allows for extending the method also to programs with stratified
negation in a natural way.
The proposed method has been implemented in DLV and various experiments have
been conducted. Experimental results on synthetic data confirm the utility of
MS for disjunctive Datalog, and they highlight the computational gain that may
be obtained by the new method w.r.t. the previously proposed MS methods for
disjunctive Datalog programs. Further experiments on real-world data show the
benefits of MS within an application scenario that has received considerable
attention in recent years, the problem of answering user queries over possibly
inconsistent databases originating from integration of autonomous sources of
information.
Comment: 67 pages, 19 figures, preprint submitted to Artificial Intelligence
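For readers unfamiliar with binding propagation, here is a minimal Python sketch of the classic Magic Set rewriting on a plain, positive Datalog program. It does not cover disjunction, negation, or the dynamic pruning described above, and it is unrelated to the DLV implementation. The program is `anc(X,Y) :- par(X,Y).` and `anc(X,Y) :- par(X,Z), anc(Z,Y).` with the query `anc(a,Y)`; the function names and the sample `par` facts are assumptions made for illustration.

```python
def anc_full(par):
    """Bottom-up fixpoint of the whole anc relation, ignoring the query."""
    anc = set(par)
    while True:
        new = {(x, y) for (x, z) in par for (z2, y) in anc if z == z2}
        if new <= anc:
            return anc
        anc |= new

def anc_magic(par, start):
    """Bottom-up evaluation of the magic-rewritten program for anc(start, Y)."""
    # magic(start).
    # magic(Z) :- magic(X), par(X, Z).
    magic = {start}
    while True:
        new = {z for (x, z) in par if x in magic}
        if new <= magic:
            break
        magic |= new
    # anc(X, Y) :- magic(X), par(X, Y).
    # anc(X, Y) :- magic(X), par(X, Z), anc(Z, Y).
    anc = {(x, y) for (x, y) in par if x in magic}
    while True:
        new = {(x, y)
               for (x, z) in par if x in magic
               for (z2, y) in anc if z == z2}
        if new <= anc:
            return anc
        anc |= new

# Hypothetical facts: two unrelated chains, a -> b -> c and x -> y -> z.
par = {("a", "b"), ("b", "c"), ("x", "y"), ("y", "z")}
full = anc_full(par)
pruned = anc_magic(par, "a")
```

Both evaluations give the same answers for `anc(a, Y)`, but the rewritten program never derives facts about the `x -> y -> z` chain, because the `magic` predicate restricts rule firing to constants reachable from the query binding.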
Query Stability in Monotonic Data-Aware Business Processes [Extended Version]
Organizations continuously accumulate data, often according to some business
processes. If one poses a query over such data for decision support, it is
important to know whether the query is stable, that is, whether the answers
will stay the same or may change in the future because business processes may
add further data. We investigate query stability for conjunctive queries. To
this end, we define a formalism that combines an explicit representation of the
control flow of a process with a specification of how data is read and inserted
into the database. We consider different restrictions of the process model and
the state of the system, such as negation in conditions, cyclic executions,
read access to written data, presence of pending process instances, and the
possibility to start fresh process instances. We identify for which facet
combinations stability of conjunctive queries is decidable and provide
encodings into variants of Datalog that are optimal with respect to the
worst-case complexity of the problem.
Comment: This report is the extended version of a paper accepted at the 19th
International Conference on Database Theory (ICDT 2016), March 15-18, 2016 -
Bordeaux, France