The Dichotomy of Evaluating Homomorphism-Closed Queries on Probabilistic Graphs
We study the problem of probabilistic query evaluation on probabilistic
graphs, namely, tuple-independent probabilistic databases on signatures of
arity two. Our focus is the class of queries that is closed under
homomorphisms, or equivalently, the infinite unions of conjunctive queries. Our
main result states that all unbounded queries from this class are #P-hard for
probabilistic query evaluation. As bounded queries from this class are
equivalent to a union of conjunctive queries, they are already classified by
the dichotomy of Dalvi and Suciu (2012). Hence, our result and theirs imply a
complete data complexity dichotomy, between polynomial time and #P-hardness,
for evaluating infinite unions of conjunctive queries over probabilistic
graphs. This dichotomy covers in particular all fragments of infinite unions of
conjunctive queries such as negation-free (disjunctive) Datalog, regular path
queries, and a large class of ontology-mediated queries on arity-two
signatures. Our result is shown by reducing from counting the valuations of
positive partitioned 2-DNF formulae for some queries, or from the
source-to-target reliability problem in an undirected graph for other queries,
depending on properties of minimal models. The presented dichotomy result
applies even to a special case of probabilistic query evaluation called
generalized model counting, where fact probabilities must be 0, 0.5, or 1.
Comment: 30 pages. Journal version of the ICDT'20 paper
https://drops.dagstuhl.de/opus/volltexte/2020/11939/. Submitted to LMCS. The
previous version (version 2) was the same as the ICDT'20 paper with some
minor formatting tweaks and 7 extra pages of technical appendi
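To make the problem statement in the abstract above concrete, the following is a minimal brute-force sketch of probabilistic query evaluation on a tiny probabilistic graph. It is not the paper's method: it simply enumerates all possible worlds, which takes exponential time. The query `holds_rst` (a Boolean regular path query matching R S* T), the function name `pqe`, and the example facts are all assumptions chosen for illustration. The fact probabilities are restricted to 0.5 and 1, as in generalized model counting.

```python
from itertools import product


def holds_rst(world):
    """Boolean regular path query R S* T: an R-edge, then zero or more
    S-edges, then a T-edge. Facts are (label, source, target) triples."""
    # Nodes reached immediately after traversing some R-edge.
    frontier = {v for (label, u, v) in world if label == "R"}
    seen = set(frontier)
    while frontier:
        node = frontier.pop()
        for (label, u, v) in world:
            if u != node:
                continue
            if label == "T":
                return True
            if label == "S" and v not in seen:
                seen.add(v)
                frontier.add(v)
    return False


def pqe(tid, query):
    """Sum the probabilities of all possible worlds satisfying the query.
    `tid` maps each fact to its independent marginal probability."""
    facts = list(tid)
    total = 0.0
    for choice in product([False, True], repeat=len(facts)):
        weight = 1.0
        world = set()
        for fact, keep in zip(facts, choice):
            p = tid[fact]
            weight *= p if keep else 1.0 - p
            if keep:
                world.add(fact)
        if weight > 0 and query(world):
            total += weight
    return total


# Hypothetical example instance (not taken from the paper).
tid = {
    ("R", "a", "b"): 0.5,
    ("S", "b", "c"): 0.5,
    ("T", "c", "d"): 1.0,
    ("T", "b", "d"): 0.5,
}
print(pqe(tid, holds_rst))
```

The query is closed under homomorphisms and is not equivalent to any finite union of conjunctive queries, so it is an example of the unbounded queries the abstract refers to.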
A Dichotomy for Homomorphism-Closed Queries on Probabilistic Graphs
We study the problem of probabilistic query evaluation (PQE) over probabilistic graphs, namely, tuple-independent probabilistic databases (TIDs) on signatures of arity two. Our focus is the class of queries that is closed under homomorphisms, or equivalently, the infinite unions of conjunctive queries, denoted UCQ∞. Our main result states that all unbounded queries in UCQ∞ are #P-hard for PQE. As bounded queries in UCQ∞ are already classified by the dichotomy of Dalvi and Suciu [Dalvi and Suciu, 2012], our results and theirs imply a complete dichotomy on PQE for UCQ∞ queries over probabilistic graphs. This dichotomy covers in particular all fragments in UCQ∞ such as negation-free (disjunctive) Datalog, regular path queries, and a large class of ontology-mediated queries on arity-two signatures. Our result is shown by reducing from counting the valuations of positive partitioned 2-DNF formulae (#PP2DNF) for some queries, or from the source-to-target reliability problem in an undirected graph (#U-ST-CON) for other queries, depending on properties of minimal models.
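As an illustration of the first source problem named above, here is a small sketch of #PP2DNF under the usual reading: the variables are partitioned into two sets X and Y, every clause is a conjunction of one X variable and one Y variable without negation, and the task is to count satisfying valuations. The function name `count_pp2dnf` and the clause encoding as index pairs are assumptions made for this example. The enumeration is exponential and only meant to pin down the definition.

```python
from itertools import product


def count_pp2dnf(n_x, n_y, clauses):
    """Count valuations satisfying a positive partitioned 2-DNF formula.
    Each clause (i, j) stands for the conjunction X_i AND Y_j."""
    count = 0
    for xs in product([False, True], repeat=n_x):
        for ys in product([False, True], repeat=n_y):
            # The DNF holds if at least one clause has both variables true.
            if any(xs[i] and ys[j] for i, j in clauses):
                count += 1
    return count


# (X_0 AND Y_0) OR (X_1 AND Y_1): 16 valuations, 9 of them falsify both clauses.
print(count_pp2dnf(2, 2, [(0, 0), (1, 1)]))  # prints 7
```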
Tuple-Independent Representations of Infinite Probabilistic Databases
Probabilistic databases (PDBs) are probability spaces over database
instances. They provide a framework for handling uncertainty in databases, as
occurs due to data integration, noisy data, data from unreliable sources or
randomized processes. Most of the existing theory literature investigated
finite, tuple-independent PDBs (TI-PDBs) where the occurrences of tuples are
independent events. Only recently, Grohe and Lindner (PODS '19) introduced
independence assumptions for PDBs beyond the finite domain assumption. In the
finite setting, a major argument for discussing the theoretical properties of TI-PDBs
is that they can be used to represent any finite PDB via views. This is no
longer the case once the number of tuples is countably infinite. In this paper,
we systematically study the representability of infinite PDBs in terms of
TI-PDBs and the related block-independent disjoint PDBs.
The central question is which infinite PDBs are representable as first-order
views over tuple-independent PDBs. We give a necessary condition for the
representability of PDBs and provide a sufficient criterion for
representability in terms of the probability distribution of a PDB. With
various examples, we explore the limits of our criteria. We show that
conditioning on first-order properties yields no additional power in terms of
expressivity. Finally, we discuss the relation between purely logical and
arithmetic reasons for (non-)representability.
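The definition of a finite tuple-independent PDB used in the abstract above can be sketched as follows: each fact carries a marginal probability, facts occur independently, and the probability of an instance is the product of the corresponding factors. This covers only the finite case; the infinite setting studied in the paper is not modelled here. The function name `ti_pdb_distribution` and the example facts are assumptions for illustration.

```python
from fractions import Fraction
from itertools import combinations


def ti_pdb_distribution(marginals):
    """Return the probability of every instance (set of facts) of a finite
    tuple-independent PDB given as a map from fact to marginal probability."""
    facts = sorted(marginals)
    dist = {}
    for size in range(len(facts) + 1):
        for instance in combinations(facts, size):
            present = set(instance)
            prob = Fraction(1)
            for fact in facts:
                # Independent events: multiply p for present facts, 1 - p otherwise.
                prob *= marginals[fact] if fact in present else 1 - marginals[fact]
            dist[frozenset(instance)] = prob
    return dist


# Hypothetical two-fact example.
marginals = {"R(a)": Fraction(1, 2), "R(b)": Fraction(1, 3)}
dist = ti_pdb_distribution(marginals)
print(sum(dist.values()))  # prints 1
```

Exact fractions are used so that the check that the instance probabilities sum to 1 does not depend on floating-point rounding.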
A dichotomy for homomorphism-closed queries on probabilistic graphs
We study the problem of probabilistic query evaluation (PQE) over probabilistic graphs, namely, tuple-independent probabilistic databases (TIDs) on signatures of arity two. Our focus is the class of queries that is closed under homomorphisms, or equivalently, the infinite unions of conjunctive queries, denoted UCQ∞. Our main result states that all unbounded queries in UCQ∞ are #P-hard for PQE. As bounded queries in UCQ∞ are already classified by the dichotomy of Dalvi and Suciu [Dalvi and Suciu, 2012], our results and theirs imply a complete dichotomy on PQE for UCQ∞ queries over probabilistic graphs. This dichotomy covers in particular all fragments in UCQ∞ such as negation-free (disjunctive) Datalog, regular path queries, and a large class of ontology-mediated queries on arity-two signatures. Our result is shown by reducing from counting the valuations of positive partitioned 2-DNF formulae (#PP2DNF) for some queries, or from the source-to-target reliability problem in an undirected graph (#U-ST-CON) for other queries, depending on properties of minimal models.
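The second source problem, source-to-target reliability in an undirected graph, can be illustrated with a brute-force count. The sketch below assumes the counting formulation in which each edge is kept or dropped independently, so the quantity of interest is the number of edge subsets that leave the source connected to the target; the abstract does not spell out this formulation, and the function names are hypothetical. The enumeration is exponential in the number of edges.

```python
from itertools import combinations


def connected(edges, source, target):
    """Depth-first search over an undirected edge list."""
    adjacency = {}
    for u, v in edges:
        adjacency.setdefault(u, set()).add(v)
        adjacency.setdefault(v, set()).add(u)
    stack, seen = [source], {source}
    while stack:
        node = stack.pop()
        if node == target:
            return True
        for neighbour in adjacency.get(node, ()):
            if neighbour not in seen:
                seen.add(neighbour)
                stack.append(neighbour)
    return False


def count_st_connected_subgraphs(edges, source, target):
    """Count the edge subsets in which source and target stay connected."""
    return sum(
        1
        for size in range(len(edges) + 1)
        for subset in combinations(edges, size)
        if connected(subset, source, target)
    )


# Triangle on s, a, t: 4 subsets contain the direct edge, plus the two-edge path.
triangle = [("s", "a"), ("a", "t"), ("s", "t")]
print(count_st_connected_subgraphs(triangle, "s", "t"))  # prints 5
```

Dividing the count by 2 to the power of the number of edges gives the reliability when every edge is present with probability 0.5.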