10 research outputs found

    A Negative Conjunctive Query is Easy if and only if it is Beta-Acyclic

    It is known that the data complexity of a Conjunctive Query (CQ) is determined only by the way its variables are shared between atoms, as reflected by its hypergraph. In particular, Yannakakis [18, 3] proved that a CQ is decidable in linear time when it is α-acyclic, i.e. when its hypergraph is α-acyclic; Bagan et al. [2] even state, under certain hypotheses: any CQ is decidable in linear time iff it is α-acyclic. (By linear time, we mean that a query on a structure S can be decided in time O(|S|).) A natural question is: since the complexity of a Negative Conjunctive Query (NCQ), a conjunctive query where all atoms are negated, also depends only on its hypergraph, can we find a similar dichotomy in this case? To answer this question, we revisit a result of Ordyniak et al. [17], which states that satisfiability of a β-acyclic CNF formula is decidable in polynomial time, by proving that some part of their procedure can be done in linear time. This implies, under an algorithmic hypothesis that is likely true (precisely: one cannot decide whether a graph is triangle-free in time O(n^2 log n), where n is the number of vertices): any NCQ is decidable in quasi-linear time iff it is β-acyclic. (By quasi-linear time, we mean that a query on a structure S can be decided in time O(|S| log |S|).) We extend the easiness result to Signed Conjunctive Queries (SCQ), where some atoms are negated. This is of great interest since using some negated atoms is natural in the frameworks of databases and CSP. Furthermore, it straightforwardly implies the following: any β-acyclic existential first-order query is decidable in quasi-linear time.
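The notion of β-acyclicity used in the abstract above can be illustrated with a small test. The sketch below is not the procedure from the paper; it relies on the standard characterization that a hypergraph is β-acyclic exactly when it can be emptied by repeatedly removing a nest point (a vertex whose incident hyperedges form a chain under inclusion). The function name `is_beta_acyclic` and the input format (an iterable of vertex collections) are assumptions made for illustration, and the loop is a naive polynomial-time check, not a linear-time one.

```python
def is_beta_acyclic(edges):
    """Naive beta-acyclicity test by nest-point elimination (illustrative sketch)."""
    # Work on a set of distinct, non-empty hyperedges.
    edges = {frozenset(e) for e in edges if e}
    while edges:
        vertices = set().union(*edges)
        nest_point = None
        for v in vertices:
            # Hyperedges containing v, ordered by size.
            incident = sorted((e for e in edges if v in e), key=len)
            # v is a nest point if these hyperedges are linearly ordered by inclusion.
            if all(a <= b for a, b in zip(incident, incident[1:])):
                nest_point = v
                break
        if nest_point is None:
            # No nest point left: the remaining hypergraph is beta-cyclic.
            return False
        # Remove the nest point from every hyperedge and drop emptied hyperedges.
        edges = {e - {nest_point} for e in edges}
        edges.discard(frozenset())
    return True


if __name__ == "__main__":
    path = [{"a", "b"}, {"b", "c"}]
    triangle = [{"a", "b"}, {"b", "c"}, {"a", "c"}]
    # Adding the hyperedge {a, b, c} makes the triangle alpha-acyclic,
    # but it stays beta-cyclic because the triangle is a sub-hypergraph.
    covered_triangle = triangle + [{"a", "b", "c"}]
    print(is_beta_acyclic(path))
    print(is_beta_acyclic(triangle))
    print(is_beta_acyclic(covered_triangle))
```

In the hypergraph of a query, vertices are the variables and each hyperedge is the set of variables of one atom, so the same function can be applied to the atoms of an NCQ.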

    Hypergraph Acyclicity and Propositional Model Counting

    We show that the propositional model counting problem #SAT for CNF-formulas with hypergraphs that allow a disjoint branches decomposition can be solved in polynomial time. We show that this class of hypergraphs is incomparable to hypergraphs of bounded incidence cliquewidth, which were so far the biggest class of hypergraphs for which #SAT was known to be solvable in polynomial time. Furthermore, we present a polynomial time algorithm that computes a disjoint branches decomposition of a given hypergraph if it exists and rejects otherwise. Finally, we show that some slight extensions of the class of hypergraphs with disjoint branches decompositions lead to intractable #SAT, leaving open how to generalize the counting result of this paper.

    Compressed Representations of Conjunctive Query Results

    Relational queries, and in particular join queries, often generate large output results when executed over a huge dataset. In such cases, it is often infeasible to store the whole materialized output if we plan to reuse it further down a data processing pipeline. Motivated by this problem, we study the construction of space-efficient compressed representations of the output of conjunctive queries, with the goal of supporting efficient access to the intermediate compressed result for a given access pattern. In particular, we initiate the study of an important tradeoff: minimizing the space necessary to store the compressed result, versus minimizing the answer time and delay for an access request over the result. Our main contribution is a novel parameterized data structure, which can be tuned to trade off space for answer time. The tradeoff allows us to control the space requirement of the data structure precisely, and depends both on the structure of the query and the access pattern. We show how we can use the data structure in conjunction with query decomposition techniques, in order to efficiently represent the outputs for several classes of conjunctive queries. Comment: To appear in PODS'18; 35 pages; comments welcome.

    Enumerating Answers to First-Order Queries over Databases of Low Degree

    A class of relational databases has low degree if for all δ > 0, all but finitely many databases in the class have degree at most n^δ, where n is the size of the database. Typical examples are databases of bounded degree or of degree bounded by log n. It is known that over a class of databases having low degree, first-order boolean queries can be checked in pseudo-linear time, i.e. for all ε > 0 in time bounded by n^(1+ε). We generalize this result by considering query evaluation. We show that counting the number of answers to a query can be done in pseudo-linear time and that after a pseudo-linear time preprocessing we can test in constant time whether a given tuple is a solution to a query or enumerate the answers to a query with constant delay.

    Beyond Worst-Case Analysis for Joins with Minesweeper

    We describe a new algorithm, Minesweeper, that is able to satisfy stronger runtime guarantees than previous join algorithms (colloquially, 'beyond worst-case guarantees') for data in indexed search trees. Our first contribution is developing a framework to measure this stronger notion of complexity, which we call certificate complexity, that extends notions of Barbay et al. and Demaine et al.; a certificate is a set of propositional formulae that certifies that the output is correct. This notion captures a natural class of join algorithms. In addition, the certificate allows us to define a strictly stronger notion of runtime complexity than traditional worst-case guarantees. Our second contribution is to develop a dichotomy theorem for the certificate-based notion of complexity. Roughly, we show that Minesweeper evaluates β-acyclic queries in time linear in the certificate plus the output size, while for any β-cyclic query there is some instance that takes superlinear time in the certificate (and for which the output is no larger than the certificate size). We also extend our certificate-complexity analysis to queries with bounded treewidth and the triangle query. Comment: This is the full version of our PODS'2014 paper.

    Trade-offs in Static and Dynamic Evaluation of Hierarchical Queries

    We investigate trade-offs in static and dynamic evaluation of hierarchical queries with arbitrary free variables. In the static setting, the trade-off is between the time to partially compute the query result and the delay needed to enumerate its tuples. In the dynamic setting, we additionally consider the time needed to update the query result in the presence of single-tuple inserts and deletes to the input database. Our approach observes the degree of values in the database and uses different computation and maintenance strategies for high-degree and low-degree values. For the latter it partially computes the result, while for the former it computes enough information to allow for on-the-fly enumeration. The main result of this work defines the preprocessing time, the update time, and the enumeration delay as functions of the light/heavy threshold and of the factorization width of the hierarchical query. By conveniently choosing this threshold, our approach can recover a number of prior results when restricted to hierarchical queries. For a restricted class of hierarchical queries, our approach can achieve worst-case optimal update time and enumeration delay conditioned on the Online Matrix-Vector Multiplication Conjecture. Comment: Technical Report; 52 pages. The updated version contains: new diagrams and plots summarizing known results and putting the results of the paper into context; introduction of delta_i-hierarchical queries, for any non-negative integer i; optimality results for delta_0- and delta_1-hierarchical queries.

    Trade-offs in Static and Dynamic Evaluation of Hierarchical Queries

    We investigate trade-offs in static and dynamic evaluation of hierarchical queries with arbitrary free variables. In the static setting, the trade-off is between the time to partially compute the query result and the delay needed to enumerate its tuples. In the dynamic setting, we additionally consider the time needed to update the query result under single-tuple inserts or deletes to the database. Our approach observes the degree of values in the database and uses different computation and maintenance strategies for high-degree (heavy) and low-degree (light) values. For the latter it partially computes the result, while for the former it computes enough information to allow for on-the-fly enumeration. We define the preprocessing time, the update time, and the enumeration delay as functions of the light/heavy threshold. By appropriately choosing this threshold, our approach recovers a number of prior results when restricted to hierarchical queries. We show that for a restricted class of hierarchical queries, our approach achieves worst-case optimal update time and enumeration delay conditioned on the Online Matrix-Vector Multiplication Conjecture
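The heavy/light distinction mentioned in the abstract above can be sketched as follows. This is only an illustration of partitioning a relation by the degree of its values, not the paper's evaluation or maintenance strategy; the function name `split_heavy_light`, the tuple-based relation format, and the convention that a value is heavy when its degree is at least `threshold` are all assumptions.

```python
from collections import Counter


def split_heavy_light(relation, column, threshold):
    """Partition the rows of a relation by the degree of the value in one column.

    relation  -- list of tuples (hypothetical in-memory representation)
    column    -- index of the column whose values are classified
    threshold -- assumed convention: degree >= threshold means heavy
    """
    # Degree of a value = number of rows in which it occurs in the chosen column.
    degree = Counter(row[column] for row in relation)
    light = [row for row in relation if degree[row[column]] < threshold]
    heavy = [row for row in relation if degree[row[column]] >= threshold]
    return light, heavy


if __name__ == "__main__":
    # Value 1 occurs three times in column 0, values 2 and 3 occur once each.
    R = [(1, "a"), (1, "b"), (1, "c"), (2, "a"), (3, "d")]
    light, heavy = split_heavy_light(R, column=0, threshold=2)
    print("light:", light)
    print("heavy:", heavy)
```

Raising the threshold moves more values into the light part, which is the knob the abstract describes as trading preprocessing and update time against enumeration delay.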