141 research outputs found

    Equivalence-Invariant Algebraic Provenance for Hyperplane Update Queries

    The algebraic approach for provenance tracking, originating in the semiring model of Green et al., has proven useful as an abstract way of handling metadata. Commutative semirings were shown to be the "correct" algebraic structure for Unions of Conjunctive Queries, in the sense that their use allows provenance to be invariant under certain expected query equivalence axioms. In this paper we present the first (to our knowledge) algebraic provenance model, for a fragment of update queries, that is invariant under set equivalence. The fragment that we focus on is that of hyperplane queries, previously studied in multiple lines of work. Our algebraic provenance structure and corresponding provenance-aware semantics are based on the sound and complete axiomatization of Karabeg and Vianu. We demonstrate that our construction can guide the design of concrete provenance model instances for different applications. We further study the efficient generation and storage of provenance for hyperplane update queries. We show that a naive algorithm can lead to an exponentially large provenance expression, but remedy this by presenting a normal form which we show may be efficiently computed alongside query evaluation. We experimentally study the performance of our solution and demonstrate its scalability and usefulness, and in particular the effectiveness of our normal form representation.

    Provenance-aware knowledge representation: A survey of data models and contextualized knowledge graphs

    Expressing machine-interpretable statements in the form of subject-predicate-object triples is a well-established practice for capturing semantics of structured data. However, the standard used for representing these triples, RDF, inherently lacks the mechanism to attach provenance data, which would be crucial to make automatically generated and/or processed data authoritative. This paper is a critical review of data models, annotation frameworks, knowledge organization systems, serialization syntaxes, and algebras that enable provenance-aware RDF statements. The various approaches are assessed in terms of standard compliance, formal semantics, tuple type, vocabulary term usage, blank nodes, provenance granularity, and scalability. This assessment can be used to advance existing solutions and help implementers select the most suitable approach (or a combination of approaches) for their applications. Moreover, the analysis of the mechanisms and their limitations highlighted in this paper can serve as the basis for novel approaches in RDF-powered applications with increasing provenance needs.

    Provenance: from long-term preservation to query federation and grid reasoning


    The Vadalog System: Datalog-based Reasoning for Knowledge Graphs

    Over the past years, there has been a resurgence of Datalog-based systems in the database community as well as in industry. In this context, it has been recognized that to handle the complex knowledge-based scenarios encountered today, such as reasoning over large knowledge graphs, Datalog has to be extended with features such as existential quantification. Yet, Datalog-based reasoning in the presence of existential quantification is in general undecidable. Many efforts have been made to define decidable fragments. Warded Datalog+/- is a very promising one, as it captures PTIME complexity while allowing ontological reasoning. Yet so far, no implementation of Warded Datalog+/- was available. In this paper we present the Vadalog system, a Datalog-based system for performing complex logic reasoning tasks, such as those required in advanced knowledge graphs. The Vadalog system is Oxford's contribution to the VADA research programme, a joint effort of the universities of Oxford, Manchester and Edinburgh and around 20 industrial partners. As the main contribution of this paper, we illustrate the first implementation of Warded Datalog+/-, a high-performance Datalog+/- system utilizing an aggressive termination control strategy. We also provide a comprehensive experimental evaluation. Comment: Extended version of VLDB paper <https://doi.org/10.14778/3213880.3213888>

    Flexible Integration and Efficient Analysis of Multidimensional Datasets from the Web

    If numeric data from the Web are brought together, natural scientists can compare climate measurements with estimations, financial analysts can evaluate companies based on balance sheets and daily stock market values, and citizens can explore the GDP per capita from several data sources. However, the heterogeneity and size of the data remain a problem. This work presents methods to query a uniform view - the Global Cube - of available datasets from the Web and builds on Linked Data query approaches.

    On expressibility of non-monotone operators in SPARQL

    SPARQL, a query language for RDF graphs, is one of the key technologies for the Semantic Web. The expressivity and complexity of various fragments of SPARQL have been studied extensively. It is usually assumed that the optional matching operator OPTIONAL has only two graph patterns as arguments. The specification of SPARQL, however, defines it as a ternary operator, with an additional filter condition. We address the problem of expressibility of the full ternary OPTIONAL via the simplified binary version and show that it is possible, but only with an exponential blowup in the size of the query (under common complexity-theoretic assumptions). We also study expressibility of other non-monotone SPARQL operators via optional matching and each other.

    SOWL QL: Querying Spatio-Temporal Ontologies in OWL

    We introduce SOWL QL, a query language for spatio-temporal information in ontologies. Building upon SOWL (Spatio-Temporal OWL), an ontology for handling spatio-temporal information in OWL, SOWL QL supports querying over qualitative spatio-temporal information (expressed using natural language expressions such as “before”, “after”, “north of”, “south of”) rather than merely quantitative information (exact dates, times, locations). SOWL QL extends SPARQL with a powerful set of temporal and spatial operators, including temporal Allen topological, spatial directional and topological operations or combinations of the above. SOWL QL maintains simplicity of expression as well as upward and downward compatibility with SPARQL. Query translation in SOWL QL yields SPARQL queries, implying that querying spatio-temporal ontologies using SPARQL is still feasible but suffers from several drawbacks, the most important being that queries in SPARQL become particularly complicated and users must be familiar with the underlying spatio-temporal representation (the “N-ary relations” or the “4D-fluents” approach in this work). Finally, querying in SOWL QL is supported by the SOWL reasoner, which is not part of the standard SPARQL translation. The run-time performance of SOWL QL has been assessed experimentally in a real data setting. A critical analysis of its performance is also presented.