Evaluating Datalog via Tree Automata and Cycluits
We investigate parameterizations of both database instances and queries that
make query evaluation fixed-parameter tractable in combined complexity. We show
that clique-frontier-guarded Datalog with stratified negation (CFG-Datalog)
enjoys bilinear-time evaluation on structures of bounded treewidth for programs
of bounded rule size. Such programs capture in particular conjunctive queries
with simplicial decompositions of bounded width, guarded negation fragment
queries of bounded CQ-rank, or two-way regular path queries. Our result is
shown by translating to alternating two-way automata, whose semantics is
defined via cyclic provenance circuits (cycluits) that can be tractably
evaluated.
Comment: 56 pages, 63 references. Journal version of "Combined Tractability of
Query Evaluation via Tree Automata and Cycluits (Extended Version)" at
arXiv:1612.04203. Up to the stylesheet, page/environment numbering, and
possible minor publisher-induced changes, this is the exact content of the
journal paper that will appear in Theory of Computing Systems. Update wrt
version 1: latest reviewer feedbac
Conjunctive Queries on Probabilistic Graphs: Combined Complexity
Query evaluation over probabilistic databases is known to be intractable in many cases, even in data complexity, i.e., when the query is fixed. Although some restrictions of the queries [20] and instances [4] have been proposed to lower the complexity, these known tractable cases usually do not apply to combined complexity, i.e., when the query is not fixed. This leaves open the question of which query and instance languages ensure the tractability of probabilistic query evaluation in combined complexity. This paper proposes the first general study of the combined complexity of conjunctive query evaluation on probabilistic instances over binary signatures, which we can alternatively phrase as a probabilistic version of the graph homomorphism problem, or of a constraint satisfaction problem (CSP) variant. We study the complexity of this problem depending on whether instances and queries can use features such as edge labels, disconnectedness, branching, and edges in both directions. We show that the complexity landscape is surprisingly rich, using a variety of technical tools: automata-based compilation to d-DNNF lineages as in [4], β-acyclic lineages using [11], the X-property for tractable CSP from [25], graded DAGs [28] and various coding techniques for hardness proofs.
Conjunctive Queries on Probabilistic Graphs: The Limits of Approximability
Query evaluation over probabilistic databases is a notoriously intractable
problem -- not only in combined complexity, but for many natural queries in
data complexity as well. This motivates the study of probabilistic query
evaluation through the lens of approximation algorithms, and particularly of
combined FPRASes, whose runtime is polynomial in both the query and instance
size. In this paper, we focus on tuple-independent probabilistic databases over
binary signatures, which can be equivalently viewed as probabilistic graphs. We
study in which cases we can devise combined FPRASes for probabilistic query
evaluation in this setting.
We settle the complexity of this problem for a variety of query and instance
classes, by proving both approximability and (conditional) inapproximability
results. This allows us to deduce many corollaries of possible independent
interest. For example, we show how the results of Arenas et al. on counting
fixed-length strings accepted by an NFA imply the existence of an FPRAS for the
two-terminal network reliability problem on directed acyclic graphs: this was
an open problem until now. We also show that one cannot extend the recent
result of van Bremen and Meel that gives a combined FPRAS for self-join-free
conjunctive queries of bounded hypertree width on probabilistic databases:
neither the bounded-hypertree-width condition nor the self-join-freeness
hypothesis can be relaxed. Finally, we complement all our inapproximability
results with unconditional lower bounds, showing that DNNF provenance circuits
must have at least moderately exponential size in combined complexity.
Comment: 19 pages. Submitte
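As an illustration of the two-terminal network reliability problem mentioned in the abstract above, here is a minimal sketch that computes the exact reliability of a tiny probabilistic directed graph by enumerating all possible worlds. This is a brute-force baseline that takes exponential time, not the FPRAS from the paper; the function names and the example graph are assumptions made for illustration only.

```python
from itertools import product


def reachable(edges, source, target):
    """Return True if target is reachable from source along the directed edges."""
    adjacency = {}
    for u, v in edges:
        adjacency.setdefault(u, []).append(v)
    stack, seen = [source], {source}
    while stack:
        node = stack.pop()
        if node == target:
            return True
        for nxt in adjacency.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return False


def two_terminal_reliability(prob_edges, source, target):
    """Exact reliability: sum the weights of the possible worlds where target is reachable.

    prob_edges maps each directed edge (u, v) to its independent probability
    of being present (tuple-independent semantics). Runs in time 2^m for m edges.
    """
    edges = list(prob_edges)
    total = 0.0
    for keep in product([False, True], repeat=len(edges)):
        weight = 1.0
        world = []
        for edge, kept in zip(edges, keep):
            p = prob_edges[edge]
            weight *= p if kept else 1.0 - p
            if kept:
                world.append(edge)
        if reachable(world, source, target):
            total += weight
    return total


# Hypothetical example DAG: a direct edge s -> t and a two-step path s -> a -> t.
example = {("s", "a"): 0.5, ("a", "t"): 0.5, ("s", "t"): 0.5}
reliability = two_terminal_reliability(example, "s", "t")
print(reliability)  # 1 - (1 - 0.5) * (1 - 0.25) = 0.625
```

The enumeration makes the source of intractability visible: the number of possible worlds doubles with every edge, which is why approximation schemes are of interest.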
Connecting Width and Structure in Knowledge Compilation
Several query evaluation tasks can be done via knowledge compilation: the query result is compiled as a lineage circuit from which the answer can be determined. For such tasks, it is important to leverage some width parameters of the circuit, such as bounded treewidth or pathwidth, to convert the circuit to structured classes, e.g., deterministic structured NNFs (d-SDNNFs) or OBDDs. In this work, we show how to connect the width of circuits to the size of their structured representation, through upper and lower bounds. For the upper bound, we show how bounded-treewidth circuits can be converted to a d-SDNNF, in time linear in the circuit size. Our bound, unlike existing results, is constructive and only singly exponential in the treewidth. We show a related lower bound on monotone DNF or CNF formulas, assuming a constant bound on the arity (size of clauses) and degree (number of occurrences of each variable). Specifically, any d-SDNNF (resp., SDNNF) for such a DNF (resp., CNF) must be of exponential size in its treewidth; and the same holds for pathwidth when compiling to OBDDs. Our lower bounds, in contrast with most previous work, apply to any formula of this class, not just a well-chosen family. Hence, for our language of DNF and CNF, pathwidth and treewidth respectively characterize the efficiency of compiling to OBDDs and (d-)SDNNFs, that is, compilation is singly exponential in the width parameter. We conclude by applying our lower bound results to the task of query evaluation.
Provenance and Probabilities in Relational Databases: From Theory to Practice
We review the basics of data provenance in relational databases. We describe different provenance formalisms, from Boolean provenance to provenance semirings and beyond, that can be used for a wide variety of purposes, to obtain additional information on the output of a query. We discuss representation systems for data provenance, circuits in particular, with a focus on practical implementation. Finally, we explain how provenance is practically used for probabilistic query evaluation in probabilistic databases.
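To make the idea of provenance semirings in the abstract above concrete, here is a minimal sketch that evaluates one provenance expression under two different semirings. The `Semiring` class, the nested-tuple encoding of expressions, and the tuple names `a`, `b`, `c` are all assumptions chosen for illustration; they are not taken from the paper or from any particular system.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass(frozen=True)
class Semiring:
    zero: Any
    one: Any
    plus: Callable[[Any, Any], Any]
    times: Callable[[Any, Any], Any]


# Boolean semiring: is the output tuple present at all?
BOOLEAN = Semiring(False, True, lambda x, y: x or y, lambda x, y: x and y)
# Counting semiring: how many derivations does the output tuple have?
COUNTING = Semiring(0, 1, lambda x, y: x + y, lambda x, y: x * y)


def evaluate(expr, semiring, valuation):
    """Evaluate a provenance expression in the given semiring.

    An expression is either an input tuple name (str) or a nested tuple
    ("plus", e1, e2, ...) or ("times", e1, e2, ...).
    """
    if isinstance(expr, str):
        return valuation[expr]
    op, *args = expr
    if op == "plus":
        result, combine = semiring.zero, semiring.plus
    elif op == "times":
        result, combine = semiring.one, semiring.times
    else:
        raise ValueError(f"unknown operator: {op}")
    for arg in args:
        result = combine(result, evaluate(arg, semiring, valuation))
    return result


# Hypothetical provenance of one output tuple: (a * b) + (a * c).
provenance = ("plus", ("times", "a", "b"), ("times", "a", "c"))

present = evaluate(provenance, BOOLEAN, {"a": True, "b": False, "c": True})
derivations = evaluate(provenance, COUNTING, {"a": 2, "b": 3, "c": 1})
print(present, derivations)  # True 8
```

The same expression is computed once and then reinterpreted, which is the practical appeal of keeping provenance in a symbolic form such as a circuit.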
Combined Tractability of Query Evaluation via Tree Automata and Cycluits (Extended Version)
We investigate parameterizations of both database instances and queries that make query evaluation fixed-parameter tractable in combined complexity. We introduce a new Datalog fragment with stratified negation, intensional-clique-guarded Datalog (ICG-Datalog), with linear-time evaluation on structures of bounded treewidth for programs of bounded rule size. Such programs capture in particular conjunctive queries with simplicial decompositions of bounded width, guarded negation fragment queries of bounded CQ-rank, or two-way regular path queries. Our result proceeds via compilation to alternating two-way automata, whose semantics is defined via cyclic provenance circuits (cycluits) that can be tractably evaluated. Last, we prove that probabilistic query evaluation remains intractable in combined complexity under this parameterization.
Comment: 69 pages, accepted at ICDT'17. Appendix F contains results from an independent upcoming journal paper by Michael Benedikt, Pierre Bourhis, Georg Gottlob, and Pierre Senellart.