Query evaluation in tuple-independent probabilistic databases is the problem
of computing the probability of an answer to a query given independent
probabilities of the individual tuples in a database instance. There are two
main approaches to this problem: (1) in `grounded inference' one first obtains
the lineage for the query and database instance as a Boolean formula, then
performs weighted model counting on the lineage (i.e., computes the probability
of the lineage given probabilities of its independent Boolean variables); (2)
in methods known as `lifted inference' or `extensional query evaluation', one
exploits the high-level structure of the query as a first-order formula.
Although it is widely believed that lifted inference is strictly more powerful
than grounded inference on the lineage alone, no formal separation has
previously been shown for query evaluation. In this paper we show such a formal
separation for the first time.
We exhibit a class of queries for which model counting can be done in
polynomial time using extensional query evaluation, whereas the algorithms used
in state-of-the-art exact model counters on their lineages provably require
exponential time. Our lower bounds on the running times of these exact model
counters follow from new exponential size lower bounds on the kinds of d-DNNF
representations of the lineages that these model counters (either explicitly or
implicitly) produce. Though some of these queries have been studied before, no
non-trivial lower bounds on the sizes of these representations for these
queries were previously known.Comment: To appear in International Conference on Database Theory (ICDT) 201