Prediction-hardness of acyclic conjunctive queries

Author: Hirata Kouichi
Publication venue: Elsevier B.V.
Publication date: 02/12/2005

Abstract: A conjunctive query problem is the problem of determining whether or not a tuple belongs to the answer of a conjunctive query over a database. In this paper, a tuple, a conjunctive query and a database in relational database theory are regarded as a ground atom, a nonrecursive function-free definite clause and a finite set of ground atoms, respectively, in inductive logic programming terminology. An acyclic conjunctive query problem is a conjunctive query problem with acyclicity. For the acyclic conjunctive query problem, we present hardness results for predicting acyclic conjunctive queries from an instance with a j-database, in which every predicate symbol is at most j-ary. We deal with two kinds of instances: a simple instance, given as a set of ground atoms, and an extended instance, given as a set of pairs of a ground atom and a description. We mainly show that, from both simple and extended instances, acyclic conjunctive queries are not polynomial-time predictable with j-databases (j⩾3) under cryptographic assumptions, and that predicting acyclic conjunctive queries with 2-databases is as hard as predicting DNF formulas. Hence, acyclic conjunctive queries become a natural example in which the equivalence between subsumption-efficiency and efficient pac-learnability from both simple and extended instances collapses.
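The membership question this abstract defines (does a tuple belong to the answer of a conjunctive query over a database?) can be illustrated with a minimal sketch. It is a plain backtracking search, not the paper's construction. The names `is_variable`, `match_atom` and `in_answer`, and the convention that variables are strings starting with an uppercase letter, are assumptions made for this illustration only.

```python
def is_variable(term):
    # Assumed convention: variables are strings starting with an uppercase letter.
    return isinstance(term, str) and term[:1].isupper()


def match_atom(atom, fact, binding):
    """Extend `binding` so that `atom` equals the ground `fact`, or return None."""
    pred, args = atom
    fact_pred, fact_args = fact
    if pred != fact_pred or len(args) != len(fact_args):
        return None
    extended = dict(binding)
    for term, value in zip(args, fact_args):
        if is_variable(term):
            # Bind the variable, or check consistency with an earlier binding.
            if extended.setdefault(term, value) != value:
                return None
        elif term != value:
            return None
    return extended


def in_answer(head_vars, body, database, tup):
    """Check whether `tup` is in the answer of the query head_vars :- body."""
    if len(head_vars) != len(tup):
        return False
    binding = {}
    for var, value in zip(head_vars, tup):
        if binding.setdefault(var, value) != value:
            return False

    def search(i, current):
        # Try to satisfy body atoms i, i+1, ... under the current binding.
        if i == len(body):
            return True
        for fact in database:
            extended = match_atom(body[i], fact, current)
            if extended is not None and search(i + 1, extended):
                return True
        return False

    return search(0, binding)


# Hypothetical example: path2(X, Z) :- edge(X, Y), edge(Y, Z)
database = {("edge", ("a", "b")), ("edge", ("b", "c"))}
body = [("edge", ("X", "Y")), ("edge", ("Y", "Z"))]
print(in_answer(("X", "Z"), body, database, ("a", "c")))
print(in_answer(("X", "Z"), body, database, ("a", "b")))
```

In the abstract's terminology, the database is a finite set of ground atoms and the query is a nonrecursive function-free definite clause. This brute-force search takes time exponential in the number of body atoms in the worst case and does not exploit acyclicity.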

Elsevier - Publisher Connector

Kyushu Institute of Technology of Academic Repository

Kyutacar : Kyushu Institute of Technology Academic Repository

On the non-efficient PAC learnability of conjunctive queries

Authors: Funk M.; Jung J.C.; Lutz C.; ten Cate B.
Publication date: 01/01/2024

This note serves three purposes: (i) we provide a self-contained exposition of the fact that conjunctive queries are not efficiently learnable in the Probably-Approximately-Correct (PAC) model, paying clear attention to the complicating fact that this concept class lacks the polynomial-size fitting property, a property that is tacitly assumed in much of the computational learning theory literature; (ii) we establish a strong negative PAC learnability result that applies to many restricted classes of conjunctive queries (CQs), including acyclic CQs for a wide range of notions of acyclicity; (iii) we show that CQs (and UCQs) are efficiently PAC learnable with membership queries.

International Migration, Integration and Social Cohesion online publications

UvA-DARE

The Complexity of the Shapley Value for Regular Path Queries

Authors: Khalil Majd; Kimelfeld Benny
Publication venue: LIPIcs - Leibniz International Proceedings in Informatics. 26th International Conference on Database Theory (ICDT 2023)
Publication date: 01/01/2023

A path query extracts vertex tuples from a labeled graph, based on the words that are formed by the paths connecting the vertices. We study the computational complexity of measuring the contribution of edges and vertices to an answer to a path query, focusing on the class of conjunctive regular path queries. To measure this contribution, we adopt the traditional Shapley value from cooperative game theory. This value has been recently proposed and studied in the context of relational database queries and has uses in a plethora of other domains. We first study the contribution of edges and show that the exact Shapley value is almost always hard to compute. Specifically, it is #P-hard to calculate the contribution of an edge whenever at least one (non-redundant) conjunct allows for a word of length three or more. In the case of regular path queries (i.e., no conjunction), the problem is tractable if the query has only words of length at most two; hence, this property fully characterizes the tractability of the problem. On the other hand, if we allow for an approximation error, then it is straightforward to obtain an efficient scheme (FPRAS) for an additive approximation. Yet, a multiplicative approximation is harder to obtain. We establish that in the case of conjunctive regular path queries, a multiplicative approximation of the Shapley value of an edge can be computed in polynomial time if and only if all query atoms are finite languages (assuming non-redundancy and conventional complexity limitations). We also study the analogous situation where we wish to determine the contribution of a vertex, rather than an edge, and establish complexity results of a similar nature.
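To make the measured quantity concrete, the following minimal sketch computes exact Shapley values of edges by enumerating all orderings of the edges. It is a brute-force illustration, exponential in the number of edges, and not an algorithm from the paper. The helper names `has_word_path` and `shapley`, the toy graph, and the single-word query "ab" are all assumptions made for this example.

```python
from fractions import Fraction
from itertools import permutations


def has_word_path(edges, source, target, word):
    """True if some path from source to target spells `word` via edge labels."""
    frontier = {source}
    for label in word:
        frontier = {v for (u, l, v) in edges if u in frontier and l == label}
    return target in frontier


def shapley(players, value):
    """Exact Shapley values by averaging marginal gains over all orderings."""
    players = list(players)
    totals = {p: Fraction(0) for p in players}
    n_orderings = 0
    for order in permutations(players):
        n_orderings += 1
        coalition = set()
        previous = value(frozenset(coalition))
        for p in order:
            coalition.add(p)
            current = value(frozenset(coalition))
            totals[p] += current - previous  # marginal contribution of p
            previous = current
    return {p: total / n_orderings for p, total in totals.items()}


# Hypothetical labeled graph: two disjoint s-to-t paths, each spelling "ab".
edges = [("s", "a", "u"), ("u", "b", "t"), ("s", "a", "w"), ("w", "b", "t")]


def query_value(subset):
    # Boolean query as a 0/1 game: is there an "ab"-labeled path from s to t?
    return 1 if has_word_path(subset, "s", "t", "ab") else 0


for edge, contribution in shapley(edges, query_value).items():
    print(edge, contribution)
```

In this toy graph the four edges are interchangeable by symmetry and their values sum to the value of the full edge set, so each edge receives 1/4.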

Dagstuhl Research Online Publication Server

On Existential First Order Queries Inference on Knowledge Graphs

Authors: Song Yangqiu; Wang Zihao; Yin Hang
Publication date: 14/04/2023

Reasoning on knowledge graphs is a challenging task because it uses observed information to predict missing information. Specifically, answering first-order logic formulas is of particular interest because of its clear syntax and semantics. Recently, the query embedding method has been proposed, which learns the embedding of a set of entities and treats logic operations as set operations. Though there has been much research following the same methodology, it lacks a systematic inspection from the standpoint of logic. In this paper, we characterize the scope of queries investigated previously and precisely identify the gap between it and the whole family of existential formulas. Moreover, we develop a new dataset containing ten new formulas and discuss the new challenges that come with it. Finally, we propose a new search algorithm from fuzzy logic theory which is capable of solving new formulas and outperforming the previous methods on existing formulas.

arXiv.org e-Print Archive

Query Answering in Probabilistic Data and Knowledge Bases

Author: Ceylan Ismail Ilkan
Publication date: 29/11/2017

Probabilistic data and knowledge bases are becoming increasingly important in academia and industry. They are continuously extended with new data, powered by modern information extraction tools that associate probabilities with knowledge base facts. The state of the art to store and process such data is founded on probabilistic database systems, which are widely and successfully employed. Beyond all the success stories, however, such systems still lack the fundamental machinery to convey some of the valuable knowledge hidden in them to the end user, which limits their potential applications in practice. In particular, in their classical form, such systems are typically based on strong, unrealistic limitations, such as the closed-world assumption, the closed-domain assumption, the tuple-independence assumption, and the lack of commonsense knowledge. These limitations not only lead to unwanted consequences, but also put such systems on weak footing in important tasks, query answering being a very central one. In this thesis, we enhance probabilistic data and knowledge bases with more realistic data models, thereby allowing for better means for querying them. Building on the long endeavor of unifying logic and probability, we develop different rigorous semantics for probabilistic data and knowledge bases, analyze their computational properties, identify sources of (in)tractability, and design practical scalable query answering algorithms whenever possible. To achieve this, the current work brings together some recent paradigms from logics, probabilistic inference, and database theory.

Technische Universität Dresden: Qucosa

Most Probable Explanations for Probabilistic Database Queries: Extended Version

Authors: Borgwardt Stefan; Ceylan Ismail Ilkan; Lukasiewicz Thomas
Publication date: 28/12/2023

Forming the foundations of large-scale knowledge bases, probabilistic databases have been widely studied in the literature. In particular, probabilistic query evaluation has been investigated intensively as a central inference mechanism. However, despite its power, query evaluation alone cannot extract all the relevant information encompassed in large-scale knowledge bases. To exploit this potential, we study two inference tasks, namely finding the most probable database and the most probable hypothesis for a given query. As natural counterparts of most probable explanations (MPE) and maximum a posteriori hypotheses (MAP) in probabilistic graphical models, they can be used in a variety of applications that involve prediction or diagnosis tasks. We investigate these problems relative to a variety of query languages, ranging from conjunctive queries to ontology-mediated queries, and provide a detailed complexity analysis.

Qucosa

HSSS - Hochschulschriftenserver der SLUB

Technische Universität Dresden: Qucosa

Optimal Algorithms for Ranked Enumeration of Answers to Full Conjunctive Queries

Authors: Ajwani Deepak; Gatterbauer Wolfgang; Riedewald Mirek; Tziavelis Nikolaos; Yang Xiaofeng
Publication date: 11/09/2020

We study ranked enumeration of join-query results according to very general orders defined by selective dioids. Our main contribution is a framework for ranked enumeration over a class of dynamic programming problems that generalizes seemingly different problems that had been studied in isolation. To this end, we extend classic algorithms that find the k-shortest paths in a weighted graph. For full conjunctive queries, including cyclic ones, our approach is optimal in terms of the time to return the top result and the delay between results. These optimality properties are derived for the widely used notion of data complexity, which treats query size as a constant. By performing a careful cost analysis, we are able to uncover a previously unknown tradeoff between two incomparable enumeration approaches: one has lower complexity when the number of returned results is small, the other when the number is very large. We theoretically and empirically demonstrate the superiority of our techniques over batch algorithms, which produce the full result and then sort it. Our technique is not only faster for returning the first few results, but on some inputs beats the batch algorithm even when all results are produced. Comment: 50 pages, 19 figures

arXiv.org e-Print Archive

Research Repository UCD