
    Verification of Query Completeness over Processes [Extended Version]

    Data completeness is an essential aspect of data quality, and in turn has a huge impact on the effective management of companies. For example, statistics are computed and audits are conducted in companies under the strong implicit assumption that the analysed data are complete. In this work, we study the completeness of data produced by business processes, with the aim of automatically assessing whether a given database query can be answered with complete information in a certain state of the process. We formalize so-called quality-aware processes that create data in the real world and store it in the company's information system, possibly at a later point. Comment: Extended version of a paper that was submitted to BPM 201

    Query Containment for Highly Expressive Datalog Fragments

    The containment problem of Datalog queries is well known to be undecidable. There are, however, several Datalog fragments for which containment is known to be decidable, most notably monadic Datalog and several "regular" query languages on graphs. Monadically Defined Queries (MQs) have been introduced recently as a joint generalization of these query languages. In this paper, we study a wide range of Datalog fragments with decidable query containment and determine exact complexity results for this problem. We generalize MQs to (Frontier-)Guarded Queries (GQs), and show that the containment problem is 3ExpTime-complete in either case, even if we allow arbitrary Datalog in the sub-query. If we focus on graph query languages, i.e., fragments of linear Datalog, then this complexity is reduced to 2ExpSpace. We also consider nested queries, which gain further expressivity by using predicates that are defined by inner queries. We show that nesting leads to an exponentially increasing hierarchy for the complexity of query containment, both in the linear and in the general case. Our results settle open problems for (nested) MQs, and they paint a comprehensive picture of the state of the art in Datalog query containment. Comment: 20 page

    Classification of annotation semirings over containment of conjunctive queries

    Funding: This work is supported under SOCIAM: The Theory and Practice of Social Machines, a project funded by the UK Engineering and Physical Sciences Research Council (EPSRC) under grant number EP/J017728/1. This work was also supported by FET-Open Project FoX, grant agreement 233599; EPSRC grants EP/F028288/1, G049165 and J015377; and the Laboratory for Foundations of Computer Science. We study the problem of query containment of conjunctive queries over annotated databases. Annotations are typically attached to tuples and represent metadata, such as probability, multiplicity, comments, or provenance. It is usually assumed that annotations are drawn from a commutative semiring. Such databases pose new challenges in query optimization, since many related fundamental tasks, such as query containment, have to be reconsidered in the presence of propagation of annotations. We axiomatize several classes of semirings for each of which containment of conjunctive queries is equivalent to existence of a particular type of homomorphism. For each of these types, we also specify all semirings for which existence of a corresponding homomorphism is a sufficient (or necessary) condition for the containment. We develop new decision procedures for containment for some semirings which are not in any of these classes. This generalizes and systematizes previous approaches. Postprint. Peer reviewe
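
    As a point of reference for the homomorphism criteria mentioned in this abstract, the following is a minimal sketch of the classical set-semantics case only: Q1 is contained in Q2 exactly when there is a homomorphism from Q2 to Q1 that maps head to head. The query encoding (a head tuple plus a list of `(predicate, args)` atoms, variables only, no constants) and the function names are assumptions made for illustration; the stricter homomorphism types needed for other semirings are not covered.

```python
def find_homomorphism(src, dst):
    """Search for a variable mapping from query `src` to query `dst`.

    A query is (head, body): head is a tuple of variable names and body is a
    list of (predicate, args) atoms. Hypothetical encoding, variables only.
    Returns a dict mapping src variables to dst variables, or None.
    """
    src_head, src_body = src
    dst_head, dst_body = dst
    if len(src_head) != len(dst_head):
        return None

    # The head of src must be mapped onto the head of dst, position by position.
    mapping = {}
    for v, w in zip(src_head, dst_head):
        if mapping.setdefault(v, w) != w:
            return None

    def extend(i, current):
        # Every atom of src must land on some atom of dst (backtracking search).
        if i == len(src_body):
            return current
        pred, args = src_body[i]
        for dpred, dargs in dst_body:
            if dpred != pred or len(dargs) != len(args):
                continue
            candidate = dict(current)
            if all(candidate.setdefault(a, b) == b for a, b in zip(args, dargs)):
                result = extend(i + 1, candidate)
                if result is not None:
                    return result
        return None

    return extend(0, mapping)


def contained_in(q1, q2):
    """Set semantics: q1 is contained in q2 iff a homomorphism q2 -> q1 exists."""
    return find_homomorphism(q2, q1) is not None


# Q1(x) :- E(x, y), E(y, x)      Q2(x) :- E(x, y)
q1 = (("x",), [("E", ("x", "y")), ("E", ("y", "x"))])
q2 = (("x",), [("E", ("x", "y"))])
```

    Here `contained_in(q1, q2)` holds while `contained_in(q2, q1)` does not, since the extra atom in Q1 cannot be matched in Q2. The search is exponential in the worst case, in line with the NP-completeness of the problem.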

    Provenance Semirings

    We show that relational algebra calculations for incomplete databases, probabilistic databases, bag semantics and why provenance are particular cases of the same general algorithms involving semirings. This further suggests a comprehensive provenance representation that uses semirings of polynomials. We extend these considerations to datalog and semirings of formal power series. We give algorithms for datalog provenance calculation as well as datalog evaluation for incomplete and probabilistic databases. Finally, we show that for some semirings containment of conjunctive queries is the same as for standard set semantics.
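
    The idea that one algorithm covers several semantics can be illustrated with a small sketch: relations map tuples to annotations, union and projection combine annotations with the semiring's addition, and join combines them with its multiplication. The `Semiring` class, the operator names, and the sample data below are assumptions for illustration, not the paper's notation.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass(frozen=True)
class Semiring:
    # Hypothetical container for a commutative semiring (K, add, mul, zero, one).
    zero: Any
    one: Any
    add: Callable[[Any, Any], Any]
    mul: Callable[[Any, Any], Any]


# Two standard instances: natural numbers (bag semantics) and Booleans (set semantics).
BAG = Semiring(0, 1, lambda a, b: a + b, lambda a, b: a * b)
BOOL = Semiring(False, True, lambda a, b: a or b, lambda a, b: a and b)


def union(k, r, s):
    """Annotations of the same tuple are added."""
    out = dict(r)
    for t, a in s.items():
        out[t] = k.add(out.get(t, k.zero), a)
    return out


def join(k, r, s):
    """Join r(x, y) with s(y, z) on y; annotations of joined tuples are multiplied."""
    out = {}
    for (x, y), a in r.items():
        for (y2, z), b in s.items():
            if y == y2:
                t = (x, y, z)
                out[t] = k.add(out.get(t, k.zero), k.mul(a, b))
    return out


def project(k, r, cols):
    """Keep the listed column positions; annotations of merged tuples are added."""
    out = {}
    for t, a in r.items():
        p = tuple(t[i] for i in cols)
        out[p] = k.add(out.get(p, k.zero), a)
    return out


# Sample data (made up): multiplicities under bag semantics.
R = {("a", "b"): 2, ("a", "c"): 1}
S = {("b", "d"): 3, ("c", "d"): 1}
bag_result = project(BAG, join(BAG, R, S), (0, 2))

# The same query under set semantics: every present tuple is annotated True.
R_set = {t: True for t in R}
S_set = {t: True for t in S}
set_result = project(BOOL, join(BOOL, R_set, S_set), (0, 2))
```

    With these inputs the bag result annotates `("a", "d")` with 2*3 + 1*1 = 7, while the Boolean result annotates it with `True`; only the semiring passed to the operators changes.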

    Reconciling Differences

    In this paper we study a problem motivated by the management of changes in databases. It turns out that several such change scenarios, e.g., the separately studied problems of view maintenance (propagation of data changes) and view adaptation (propagation of view definition changes), can be unified as instances of query reformulation using views, provided that support for the relational difference operator exists in the context of query reformulation. Exact query reformulation using views in positive relational languages is well understood, and has a variety of applications in query optimization and data sharing. Unfortunately, most questions about queries become undecidable in the presence of difference (or negation), whether we use the foundational set semantics or the more practical bag semantics. We present a new way of managing this difficulty by defining a novel semantics, Z-relations, where tuples are annotated with positive or negative integers. Z-relations conveniently represent data, insertions, and deletions in a uniform way, and can apply deletions with the union operator (deletions are tuples with negative counts). We show that under Z-semantics relational algebra (RA) queries have a normal form consisting of a single difference of positive queries, and this leads to the decidability of their equivalence. We provide a sound and complete algorithm for reformulating RA queries, including queries with difference, over Z-relations. Additionally, we show how to support standard view maintenanc
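
    The way Z-relations fold insertions and deletions into a single union can be sketched as follows. This is an illustrative reading of the abstract, not the paper's algorithm: relations are plain dictionaries from tuples to integer counts, and the function names and sample data are assumptions.

```python
def z_union(r, s):
    """Union of Z-relations: integer counts of the same tuple are added.

    Tuples whose count reaches zero are dropped from the result.
    """
    out = dict(r)
    for t, n in s.items():
        out[t] = out.get(t, 0) + n
    return {t: n for t, n in out.items() if n != 0}


def z_difference(r, s):
    """Difference is union with the negated counts of s (counts may go negative)."""
    return z_union(r, {t: -n for t, n in s.items()})


# Sample data (made up): a stored relation and an update batch.
data = {("alice",): 1, ("bob",): 2}
# One deletion of bob (negative count) and one insertion of carol, in a single delta.
delta = {("bob",): -1, ("carol",): 1}

updated = z_union(data, delta)
```

    Applying the delta leaves every tuple with count 1. Unlike bag difference, `z_difference` does not truncate at zero, so subtracting more copies than are present yields a negative count rather than an empty result.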

    On Equivalence and Cores for Incomplete Databases in Open and Closed Worlds

    Data exchange relies heavily on the notion of incomplete database instances. Several semantics for such instances have been proposed, including open (OWA), closed (CWA), and open-closed (OCWA) world. For all these semantics, the important questions are: whether one incomplete instance semantically implies another; when two are semantically equivalent; and whether a smaller or smallest semantically equivalent instance exists. For OWA and CWA these questions are fully answered. For several variants of OCWA, however, they remain open. In this work we address these questions for Closed Powerset semantics and the OCWA semantics of [Leonid Libkin and Cristina Sirangelo, 2011]. We define a new OCWA semantics, called OCWA*, in terms of homomorphic covers that subsumes both semantics, and characterize semantic implication and equivalence in terms of such covers. This characterization yields a guess-and-check algorithm to decide equivalence, and shows that the problem is NP-complete. For the minimization problem we show that for several common notions of minimality there is in general no unique minimal equivalent instance for Closed Powerset semantics, and consequently not for the more expressive OCWA* either. However, for Closed Powerset semantics we show that one can find, for any incomplete database, a unique finite set of its subinstances which are subinstances (up to renaming of nulls) of all instances semantically equivalent to the original incomplete one. We study properties of this set, and extend the analysis to OCWA*.

    An Annotation Management System for Relational Databases
