141 research outputs found
Equivalence-Invariant Algebraic Provenance for Hyperplane Update Queries
The algebraic approach to provenance tracking, originating in the semiring
model of Green et al., has proven useful as an abstract way of handling
metadata. Commutative semirings were shown to be the "correct" algebraic
structure for Unions of Conjunctive Queries, in the sense that their use allows
provenance to be invariant under certain expected query equivalence axioms.
In this paper we present the first (to our knowledge) algebraic provenance
model, for a fragment of update queries, that is invariant under set
equivalence. The fragment that we focus on is that of hyperplane queries,
previously studied in multiple lines of work. Our algebraic provenance
structure and corresponding provenance-aware semantics are based on the sound
and complete axiomatization of Karabeg and Vianu. We demonstrate that our
construction can guide the design of concrete provenance model instances for
different applications. We further study the efficient generation and storage
of provenance for hyperplane update queries. We show that a naive algorithm can
lead to an exponentially large provenance expression, but remedy this by
presenting a normal form which we show may be efficiently computed alongside
query evaluation. We experimentally study the performance of our solution and
demonstrate its scalability and usefulness, and in particular the effectiveness
of our normal form representation.
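As background for the semiring model mentioned in this abstract, the following is a minimal sketch of annotation propagation for a single conjunctive query with projection, not the paper's construction for update queries. The relations `R` and `S`, their annotations, and the helper `join_project` are hypothetical names chosen for illustration. Joint use of tuples combines annotations with the semiring product, and alternative derivations of the same output tuple combine with the semiring sum.

```python
import operator

def join_project(r, s, plus, times):
    """Evaluate Q(x) :- R(x, y), S(y) over annotated relations.

    r maps (x, y) tuples to annotations, s maps (y,) tuples to annotations.
    Joined tuples multiply their annotations; projecting away y adds the
    annotations of all derivations of the same output tuple.
    """
    out = {}
    for (x, y), a in r.items():
        if (y,) in s:
            term = times(a, s[(y,)])
            out[(x,)] = plus(out[(x,)], term) if (x,) in out else term
    return out

# Hypothetical input data annotated with multiplicities (counting semiring).
R = {("a", "b"): 2, ("a", "c"): 3}
S = {("b",): 5, ("c",): 1}

# Counting semiring (N, +, *): the annotation is the bag multiplicity.
counting = join_project(R, S, operator.add, operator.mul)

# Boolean semiring ({False, True}, or, and): the annotation is set membership.
R_bool = {t: True for t in R}
S_bool = {t: True for t in S}
boolean = join_project(R_bool, S_bool, operator.or_, operator.and_)

print(counting)  # {('a',): 13}, since 2*5 + 3*1 = 13
print(boolean)   # {('a',): True}
```

The same query code serves both semirings; only the two operations change, which is the sense in which the semiring acts as an abstract interface for metadata.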
Provenance-aware knowledge representation: A survey of data models and contextualized knowledge graphs
Expressing machine-interpretable statements in the form of subject-predicate-object triples is a well-established practice for capturing the semantics of structured data. However, the standard used for representing these triples, RDF, inherently lacks a mechanism to attach provenance data, which would be crucial to make automatically generated and/or processed data authoritative. This paper is a critical review of data models, annotation frameworks, knowledge organization systems, serialization syntaxes, and algebras that enable provenance-aware RDF statements. The various approaches are assessed in terms of standard compliance, formal semantics, tuple type, vocabulary term usage, blank nodes, provenance granularity, and scalability. This can be used to advance existing solutions and help implementers select the most suitable approach (or a combination of approaches) for their applications. Moreover, the analysis of the mechanisms and their limitations highlighted in this paper can serve as the basis for novel approaches in RDF-powered applications with increasing provenance needs.
The Vadalog System: Datalog-based Reasoning for Knowledge Graphs
Over the past years, there has been a resurgence of Datalog-based systems in
the database community as well as in industry. In this context, it has been
recognized that to handle the complex knowledge-based scenarios encountered
today, such as reasoning over large knowledge graphs, Datalog has to be
extended with features such as existential quantification. Yet, Datalog-based
reasoning in the presence of existential quantification is in general
undecidable. Many efforts have been made to define decidable fragments. Warded
Datalog+/- is a very promising one, as it captures PTIME complexity while
allowing ontological reasoning. Yet so far, no implementation of Warded
Datalog+/- was available. In this paper we present the Vadalog system, a
Datalog-based system for performing complex logic reasoning tasks, such as
those required in advanced knowledge graphs. The Vadalog system is Oxford's
contribution to the VADA research programme, a joint effort of the universities
of Oxford, Manchester and Edinburgh and around 20 industrial partners. As the
main contribution of this paper, we illustrate the first implementation of
Warded Datalog+/-, a high-performance Datalog+/- system utilizing an aggressive
termination control strategy. We also provide a comprehensive experimental
evaluation.
Comment: Extended version of VLDB paper <https://doi.org/10.14778/3213880.3213888>
Flexible Integration and Efficient Analysis of Multidimensional Datasets from the Web
If numeric data from the Web are brought together, natural scientists can compare climate measurements with estimations, financial analysts can evaluate companies based on balance sheets and daily stock market values, and citizens can explore the GDP per capita from several data sources. However, the heterogeneity and size of the data remain a problem. This work presents methods to query a uniform view, the Global Cube, of available datasets from the Web and builds on Linked Data query approaches.
On expressibility of non-monotone operators in SPARQL
SPARQL, a query language for RDF graphs, is one of the key technologies for the Semantic Web. The expressivity and complexity of various fragments of SPARQL have been studied extensively. It is usually assumed that the optional matching operator OPTIONAL has only two graph patterns as arguments. The specification of SPARQL, however, defines it as a ternary operator, with an additional filter condition. We address the problem of expressibility of the full ternary OPTIONAL via the simplified binary version and show that it is possible, but only with an exponential blowup in the size of the query (under common complexity-theoretic assumptions). We also study expressibility of other non-monotone SPARQL operators via optional matching and each other.
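To illustrate why the ternary and binary forms of OPTIONAL differ, here is a small sketch over solution mappings represented as Python dictionaries. It is not the paper's translation. The function names, the sample mappings, and the choice to treat a condition over an unbound variable as false (a simplification of SPARQL's rule that a filter error eliminates the solution) are all assumptions made for illustration. The point is that naively pushing the condition into the right-hand pattern changes the result when the condition mentions a variable bound only on the left.

```python
def compatible(m1, m2):
    """Two mappings are compatible if they agree on every shared variable."""
    return all(m2[v] == val for v, val in m1.items() if v in m2)

def left_join(omega1, omega2, cond=lambda m: True):
    """Ternary OPTIONAL: cond is evaluated on the merged mapping."""
    out = []
    for m1 in omega1:
        merged = [{**m1, **m2} for m2 in omega2 if compatible(m1, m2)]
        kept = [m for m in merged if cond(m)]
        # Keep the left mapping alone only if no extension satisfies cond.
        out.extend(kept if kept else [m1])
    return out

def younger_friend(m):
    # Assumption: an unbound variable makes the condition false.
    return "a" in m and "b" in m and m["b"] < m["a"]

# Hypothetical data: a person with age ?a, and friends with age ?b.
omega1 = [{"p": "alice", "a": 30}]
omega2 = [
    {"p": "alice", "f": "bob", "b": 25},
    {"p": "alice", "f": "carol", "b": 40},
]

# Ternary form: the condition sees ?a from the left and ?b from the right.
ternary = left_join(omega1, omega2, younger_friend)

# Naive binary form: filter the right side first, where ?a is unbound.
filtered_right = [m for m in omega2 if younger_friend(m)]
binary = left_join(omega1, filtered_right)

print(ternary)  # alice extended with bob only
print(binary)   # alice with no friend attached
```

In this sketch the ternary form extends the mapping with the younger friend, while the naive binary form loses the match entirely, so any faithful encoding of the ternary operator must do more than move the condition.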
SOWL QL: Querying Spatio-Temporal Ontologies in OWL
We introduce SOWL QL, a query language for spatio-temporal information in ontologies. Building upon SOWL (Spatio-Temporal OWL), an ontology for handling spatio-temporal information in OWL, SOWL QL supports querying over qualitative spatio-temporal information (expressed using natural language expressions such as “before”, “after”, “north of”, “south of”) rather than merely quantitative information (exact dates, times, locations). SOWL QL extends SPARQL with a powerful set of temporal and spatial operators, including temporal Allen topological, spatial directional and topological operations or combinations of the above. SOWL QL maintains simplicity of expression as well as upward and downward compatibility with SPARQL. Query translation in SOWL QL yields SPARQL queries, which implies that querying spatio-temporal ontologies using SPARQL is still feasible. It suffers from several drawbacks, however, the most important being that queries in SPARQL become particularly complicated and users must be familiar with the underlying spatio-temporal representation (the “N-ary relations” or the “4D-fluents” approach in this work). Finally, querying in SOWL QL is supported by the SOWL reasoner, which is not part of the standard SPARQL translation. The run-time performance of SOWL QL has been assessed experimentally in a real data setting. A critical analysis of its performance is also presented.