35 research outputs found

Beyond Well-designed SPARQL

Author: Kaminski Mark
Kostylev Egor V.
Publication venue: LIPIcs - Leibniz International Proceedings in Informatics. 19th International Conference on Database Theory (ICDT 2016)
Publication date: 01/01/2015

SPARQL is the standard query language for RDF data. The distinctive feature of SPARQL is the OPTIONAL operator, which allows for partial answers when complete answers are not available due to lack of information. However, optional matching is computationally expensive: query answering is PSPACE-complete. The well-designed fragment of SPARQL achieves much better computational properties by restricting the use of optional matching: query answering becomes coNP-complete. However, well-designed SPARQL captures far from all real-life queries; in fact, only about half of the queries over DBpedia that use OPTIONAL are well-designed. In the present paper, we study queries outside of well-designed SPARQL. We introduce the class of weakly well-designed queries that subsumes well-designed queries and includes most common meaningful non-well-designed queries: our analysis shows that the new fragment captures about 99% of DBpedia queries with OPTIONAL. At the same time, query answering for weakly well-designed SPARQL remains coNP-complete, and our fragment is in a certain sense maximal for this complexity. We show that the fragment's expressive power is strictly in-between well-designed and full SPARQL. Finally, we provide an intuitive normal form for weakly well-designed queries and study the complexity of containment and equivalence.
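As a rough illustration of the "partial answers" that OPTIONAL allows, the sketch below treats solution mappings as Python dictionaries and evaluates OPTIONAL as a left outer join, which is the usual textbook reading of the operator. The function names and the sample data are hypothetical and are not taken from the paper.

```python
def compatible(m1, m2):
    """Two mappings are compatible if they agree on every shared variable."""
    return all(m2[var] == val for var, val in m1.items() if var in m2)


def left_outer_join(left, right):
    """Sketch of OPTIONAL: extend each left mapping with every compatible
    right mapping, and keep the left mapping unchanged if none exists."""
    result = []
    for m1 in left:
        extended = [{**m1, **m2} for m2 in right if compatible(m1, m2)]
        result.extend(extended if extended else [m1])
    return result


# Hypothetical data: every person is known, but only one has an email.
people = [{"?p": "alice"}, {"?p": "bob"}]
emails = [{"?p": "alice", "?e": "alice@example.org"}]

answers = left_outer_join(people, emails)
for mapping in answers:
    print(mapping)
```

Here the mapping for "bob" survives without a binding for `?e`, which is the partial answer; a plain join would have dropped it.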

Dagstuhl Research Online Publication Server

Oxford University Research Archive

CONSTRUCT Queries in SPARQL

Author: Kostylev Egor V.
Reutter Juan L.
Publication venue: LIPIcs - Leibniz International Proceedings in Informatics. 18th International Conference on Database Theory (ICDT 2015)
Publication date: 01/01/2015

SPARQL has become the most popular language for querying RDF datasets, the standard data model for representing information in the Web. This query language has received a good deal of attention in the last few years: two versions of W3C standards have been issued, several SPARQL query engines have been deployed, and important theoretical foundations have been laid. However, many fundamental aspects of SPARQL queries are not yet fully understood. To this end, it is crucial to understand the correspondence between SPARQL and well-developed frameworks like relational algebra or first order logic. But one of the main obstacles on the way to such understanding is the fact that the well-studied fragments of SPARQL do not produce RDF as output. In this paper we embark on the study of SPARQL CONSTRUCT queries, that is, queries which output RDF graphs. This class of queries takes its rightful place in the standards and implementations, but contrary to SELECT queries, it has not yet attracted worthwhile theoretical research. Under this framework we are able to establish a strong connection between SPARQL and well-known logical and database formalisms. In particular, the fragment which does not allow for blank nodes in output templates corresponds to first order queries, its well-designed sub-fragment corresponds to positive first order queries, and the general language can be re-stated as a data exchange setting. These correspondences allow us to conclude that the general language is not composable, but the aforementioned blank-free fragments are. Finally, we enrich SPARQL with a recursion operator and establish fundamental properties of this extension.

Dagstuhl Research Online Publication Server

Two Variable Logic with Ultimately Periodic Counting

Author: Benedikt Michael
Kostylev Egor V.
Tan Tony
Publication venue: LIPIcs - Leibniz International Proceedings in Informatics. 47th International Colloquium on Automata, Languages, and Programming (ICALP 2020)
Publication date: 01/01/2020

We consider the extension of FO² with quantifiers that state that the number of elements where a formula holds should belong to a given ultimately periodic set. We show that both satisfiability and finite satisfiability of the logic are decidable. We also show that the spectrum of any sentence is definable in Presburger arithmetic. In the process we present several refinements to the "biregular graph method". In this method, decidability issues concerning two-variable logics are reduced to questions about Presburger definability of integer vectors associated with partitioned graphs, where nodes in a partition satisfy certain constraints on their in- and out-degrees.
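For readers unfamiliar with the term, a set of natural numbers is ultimately periodic if, from some threshold onwards, membership repeats with a fixed period. The sketch below shows one possible finite representation and a membership test. The class name and its fields are assumptions chosen for illustration, not notation from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UltimatelyPeriodicSet:
    """Hypothetical finite representation of an ultimately periodic set."""
    threshold: int        # membership is periodic for all n >= threshold
    period: int           # length of the repeating pattern (must be > 0)
    initial: frozenset    # members that are smaller than threshold
    residues: frozenset   # values of (n - threshold) % period that are members

    def __contains__(self, n):
        if n < self.threshold:
            return n in self.initial
        return (n - self.threshold) % self.period in self.residues


# "An even number of elements": periodic from 0 with period 2.
evens = UltimatelyPeriodicSet(0, 2, frozenset(), frozenset({0}))

# The set {1} together with 5, 8, 11, ...: irregular below 5, period 3 after.
mixed = UltimatelyPeriodicSet(5, 3, frozenset({1}), frozenset({0}))

print(4 in evens, 3 in evens)
print(1 in mixed, 8 in mixed, 6 in mixed)
```

A counting quantifier of the kind described in the abstract could then be checked on a finite structure by counting the elements that satisfy a formula and testing that count for membership.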

arXiv.org e-Print Archive

Dagstuhl Research Online Publication Server

Oxford University Research Archive

Stratified Negation in Limit Datalog Programs

Author: Grau Bernardo Cuenca
Horrocks Ian
Kaminski Mark
Kostylev Egor V.
Motik Boris
Publication date: 25/04/2018

There has recently been an increasing interest in declarative data analysis, where analytic tasks are specified using a logical language, and their implementation and optimisation are delegated to a general-purpose query engine. Existing declarative languages for data analysis can be formalised as variants of logic programming equipped with arithmetic function symbols and/or aggregation, and are typically undecidable. In prior work, the language of limit programs was proposed, which is sufficiently powerful to capture many analysis tasks and has decidable entailment problem. Rules in this language, however, do not allow for negation. In this paper, we study an extension of limit programs with stratified negation-as-failure. We show that the additional expressive power makes reasoning computationally more demanding, and provide tight data complexity bounds. We also identify a fragment with tractable data complexity and sufficient expressivity to capture many relevant tasks. Comment: 14 pages; full version of a paper accepted at IJCAI-1

arXiv.org e-Print Archive

Crossref

Oxford University Research Archive

Foundations of Declarative Data Analysis Using Limit Datalog Programs

Author: Grau Bernardo Cuenca
Horrocks Ian
Kaminski Mark
Kostylev Egor V.
Motik Boris
Publication date: 01/01/2017

Motivated by applications in declarative data analysis, we study $\mathit{Datalog}_{\mathbb{Z}}$, an extension of positive Datalog with arithmetic functions over integers. This language is known to be undecidable, so we propose two fragments. In limit $\mathit{Datalog}_{\mathbb{Z}}$, predicates are axiomatised to keep minimal/maximal numeric values, allowing us to show that fact entailment is coNExpTime-complete in combined, and coNP-complete in data complexity. Moreover, an additional stability requirement causes the complexity to drop to ExpTime and PTime, respectively. Finally, we show that stable $\mathit{Datalog}_{\mathbb{Z}}$ can express many useful data analysis tasks, and so our results provide a sound foundation for the development of advanced information systems. Comment: 23 pages; full version of a paper accepted at IJCAI-17; v2 fixes some typos and improves the acknowledgment

arXiv.org e-Print Archive

Crossref

Oxford University Research Archive

Classification of annotation semirings over containment of conjunctive queries

Author: Kostylev Egor V.
Reutter Juan L.
Salamon András Z.
Publication venue: Association for Computing Machinery (ACM)
Publication date: 10/03/2022

Funding: This work is supported under SOCIAM: The Theory and Practice of Social Machines, a project funded by the UK Engineering and Physical Sciences Research Council (EPSRC) under grant number EP/J017728/1. This work was also supported by FET-Open Project FoX, grant agreement 233599; EPSRC grants EP/F028288/1, G049165 and J015377; and the Laboratory for Foundations of Computer Science.

We study the problem of query containment of conjunctive queries over annotated databases. Annotations are typically attached to tuples and represent metadata, such as probability, multiplicity, comments, or provenance. It is usually assumed that annotations are drawn from a commutative semiring. Such databases pose new challenges in query optimization, since many related fundamental tasks, such as query containment, have to be reconsidered in the presence of propagation of annotations. We axiomatize several classes of semirings for each of which containment of conjunctive queries is equivalent to existence of a particular type of homomorphism. For each of these types, we also specify all semirings for which existence of a corresponding homomorphism is a sufficient (or necessary) condition for the containment. We develop new decision procedures for containment for some semirings which are not in any of these classes. This generalizes and systematizes previous approaches. Postprint. Peer reviewed
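To make "propagation of annotations" concrete, the sketch below evaluates one fixed conjunctive query over an annotated relation, following the standard convention that joins multiply annotations and alternative derivations add them. The `Semiring` class, the query, and the data are hypothetical illustrations. The code does not implement any of the containment procedures from the paper.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass(frozen=True)
class Semiring:
    """A commutative semiring given by its constants and operations."""
    zero: Any
    one: Any
    add: Callable[[Any, Any], Any]
    mul: Callable[[Any, Any], Any]


# Multiplicities (bag semantics) and plain truth values (set semantics).
BAG = Semiring(0, 1, lambda a, b: a + b, lambda a, b: a * b)
BOOL = Semiring(False, True, lambda a, b: a or b, lambda a, b: a and b)


def eval_path_query(edges, K):
    """Evaluate Q(x, z) :- E(x, y), E(y, z) over an annotated relation E.

    `edges` maps each tuple (source, target) to its annotation in K.
    """
    out = {}
    for (x, y1), a in edges.items():
        for (y2, z), b in edges.items():
            if y1 == y2:
                # Join multiplies; multiple derivations of (x, z) are added.
                out[(x, z)] = K.add(out.get((x, z), K.zero), K.mul(a, b))
    return out


# Hypothetical annotated edge relation with multiplicities.
edges = {("a", "b"): 2, ("b", "c"): 3, ("a", "d"): 1, ("d", "c"): 1}

bag_answer = eval_path_query(edges, BAG)
bool_answer = eval_path_query({t: True for t in edges}, BOOL)
print(bag_answer)
print(bool_answer)
```

The same query yields a multiplicity under `BAG` and a truth value under `BOOL`, which is why containment between two queries can depend on the semiring chosen.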

St Andrews Research Repository

GNNQ: a neuro-symbolic approach to query answering over incomplete knowledge graphs

Author: Kostylev Egor V
Pflueger Maximilian
Tena Cucala David J
Publication venue: Springer
Publication date: 16/10/2022

Real-world knowledge graphs (KGs) are usually incomplete; that is, they miss some facts representing valid information. So, when applied to such KGs, standard symbolic query engines fail to produce answers that are expected but not logically entailed by the KGs. To overcome this issue, state-of-the-art ML-based approaches first embed KGs and queries into a low-dimensional vector space, and then produce query answers based on the proximity of the candidate entity and the query embeddings in the embedding space. This allows embedding-based approaches to obtain expected answers that are not logically entailed. However, embedding-based approaches are not applicable in the inductive setting, where KG entities (i.e., constants) seen at runtime may differ from those seen during training. In this paper, we propose a novel neuro-symbolic approach to query answering over incomplete KGs applicable in the inductive setting. Our approach first symbolically augments the input KG with facts representing parts of the KG that match query fragments, and then applies a generalisation of the Relational Graph Convolutional Networks (RGCNs) to the augmented KG to produce the predicted query answers. We formally prove that, under reasonable assumptions, our approach can capture an approach based on vanilla RGCNs (and no KG augmentation) using a (often substantially) smaller number of layers. Finally, we empirically validate our theoretical findings by evaluating an implementation of our approach against the RGCN baseline on several dedicated benchmarks.

Oxford University Research Archive

The Bag Semantics of Ontology-Based Data Access

Author: Grau Bernardo Cuenca
Horrocks Ian
Kaminski Mark
Konstantinidis George
Kostylev Egor V.
Nikolaou Charalampos
Publication date: 01/01/2017

Ontology-based data access (OBDA) is a popular approach for integrating and querying multiple data sources by means of a shared ontology. The ontology is linked to the sources using mappings, which assign views over the data to ontology predicates. Motivated by the need for OBDA systems supporting database-style aggregate queries, we propose a bag semantics for OBDA, where duplicate tuples in the views defined by the mappings are retained, as is the case in standard databases. We show that bag semantics makes conjunctive query answering in OBDA coNP-hard in data complexity. To regain tractability, we consider a rather general class of queries and show its rewritability to a generalisation of the relational calculus to bags.

arXiv.org e-Print Archive

Crossref

Southampton (e-Prints Soton)

Oxford University Research Archive