The Complexity of Rooted Phylogeny Problems
Several computational problems in phylogenetic reconstruction can be
formulated as restrictions of the following general problem: given a formula in
conjunctive normal form where the literals are rooted triples, is there a
rooted binary tree that satisfies the formula? If the formulas do not contain
disjunctions, the problem becomes the famous rooted triple consistency problem,
which can be solved in polynomial time by an algorithm of Aho, Sagiv,
Szymanski, and Ullman. If the clauses in the formulas are restricted to
disjunctions of negated triples, Ng, Steel, and Wormald showed that the problem
remains NP-complete. We systematically study the computational complexity of
the problem for all such restrictions of the clauses in the input formula. For
certain restricted disjunctions of triples we present an algorithm that has
sub-quadratic running time and is asymptotically as fast as the fastest known
algorithm for the rooted triple consistency problem. We also show that any
restriction of the general rooted phylogeny problem that does not fall into our
tractable class is NP-complete, using known results about the complexity of
Boolean constraint satisfaction problems. Finally, we present a pebble game
argument that shows that the rooted triple consistency problem (and also all
generalizations studied in this paper) cannot be solved by Datalog.
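The rooted triple consistency problem mentioned in this abstract can be illustrated with a short sketch. What follows is a naive recursive version of the partition idea usually attributed to Aho, Sagiv, Szymanski, and Ullman, not the sub-quadratic algorithm from the paper. The function name `build`, the encoding of a triple ab|c as the tuple `(a, b, c)`, and the nested-tuple tree output are all assumptions made for illustration. The returned tree need not be binary.

```python
def build(leaves, triples):
    """Return a nested-tuple tree displaying every triple, or None.

    A triple (a, b, c) stands for ab|c: a and b are more closely
    related to each other than either is to c.
    """
    leaves = sorted(leaves)
    if len(leaves) == 1:
        return leaves[0]

    leaf_set = set(leaves)
    # Only triples whose three leaves all occur here constrain this subtree.
    relevant = [t for t in triples if set(t) <= leaf_set]

    # Union-find: join a and b for every relevant triple ab|c.
    parent = {x: x for x in leaves}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b, _ in relevant:
        parent[find(a)] = find(b)

    components = {}
    for x in leaves:
        components.setdefault(find(x), []).append(x)

    # A single component means the leaves cannot be split below the root.
    if len(components) == 1:
        return None

    subtrees = []
    for block in components.values():
        subtree = build(block, relevant)
        if subtree is None:
            return None
        subtrees.append(subtree)
    return tuple(subtrees)


print(build({"a", "b", "c"}, [("a", "b", "c")]))
print(build({"a", "b", "c"}, [("a", "b", "c"), ("a", "c", "b")]))
```

The first call has a consistent triple set and yields a tree grouping a with b. The second combines ab|c with ac|b, which no rooted tree can display, so it returns None.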
Datalog and Constraint Satisfaction with Infinite Templates
On finite structures, there is a well-known connection between the expressive
power of Datalog, finite variable logics, the existential pebble game, and
bounded hypertree duality. We study this connection for infinite structures.
This has applications for constraint satisfaction with infinite templates. If
the template Gamma is omega-categorical, we present various equivalent
characterizations of those Gamma such that the constraint satisfaction problem
(CSP) for Gamma can be solved by a Datalog program. We also show that
CSP(Gamma) can be solved in polynomial time for arbitrary omega-categorical
structures Gamma if the input is restricted to instances of bounded treewidth.
Finally, we characterize those omega-categorical templates whose CSP has
Datalog width 1, and those whose CSP has strict Datalog width k.
Comment: 28 pages. This is an extended long version of a conference paper that
appeared at STACS'06. In the third version in the arxiv we have revised the
presentation again and added a section that relates our results to
formalizations of CSPs using relation algebra.
Linear Datalog and Bounded Path Duality of Relational Structures
In this paper we systematically investigate the connections between logics
with a finite number of variables, structures of bounded pathwidth, and linear
Datalog Programs. We prove that, in the context of Constraint Satisfaction
Problems, all these concepts correspond to different mathematical embodiments
of a unique robust notion that we call bounded path duality. We also study the
computational complexity implications of the notion of bounded path duality. We
show that every constraint satisfaction problem CSP(B) with bounded path
duality is solvable in NL and that this notion explains in a uniform way all
families of CSPs known to be in NL. Finally, we use the results developed in
the paper to identify new problems in NL.
Inherent Complexity of Recursive Queries
We give lower bounds on the complexity of certain Datalog queries. Our notion of complexity applies to compile-time optimization techniques for Datalog; thus, our results indicate limitations of these techniques. The main new tool is linear first-order formulas, whose depth (respectively, number of variables) matches the sequential (respectively, parallel) complexity of Datalog programs. We define a combinatorial game (a variant of Ehrenfeucht–Fraïssé games) that can be used to prove nonexpressibility by linear formulas. We thus obtain lower bounds for the sequential and parallel complexity of Datalog queries. We prove syntactically tight versions of our results by exploiting uniformity and invariance properties of Datalog queries.
Datalog-Expressibility for Monadic and Guarded Second-Order Logic
We characterise the sentences in Monadic Second-order Logic (MSO) that are over finite structures equivalent to a Datalog program, in terms of an existential pebble game. We also show that for every class C of finite structures that can be expressed in MSO and is closed under homomorphisms, and for all ℓ, k ∈ ℕ, there exists a canonical Datalog program Π of width (ℓ,k), that is, a Datalog program of width (ℓ,k) which is sound for C (i.e., Π only derives the goal predicate on a finite structure 𝔄 if 𝔄 ∈ C) and with the property that Π derives the goal predicate whenever some Datalog program of width (ℓ,k) which is sound for C derives the goal predicate. The same characterisations also hold for Guarded Second-order Logic (GSO), which properly extends MSO. To prove our results, we show that every class C in GSO whose complement is closed under homomorphisms is a finite union of constraint satisfaction problems (CSPs) of ω-categorical structures.
On the speed of constraint propagation and the time complexity of arc consistency testing
Establishing arc consistency on two relational structures is one of the most
popular heuristics for the constraint satisfaction problem. We aim at
determining the time complexity of arc consistency testing. The input
structures G and H can be supposed to be connected colored graphs, as the
general problem reduces to this particular case. We first observe the upper
bound O(e(G)v(H)+v(G)e(H)), which implies the bound O(e(G)e(H)) in terms of
the number of edges and the bound O((v(G)+v(H))^3) in terms of the number of
vertices. We then show that both bounds are tight up to a constant factor as
long as an arc consistency algorithm is based on constraint propagation (like
any algorithm currently known).
Our argument for the lower bounds is based on examples of slow constraint
propagation. We measure the speed of constraint propagation observed on a pair
G, H by the size of a proof, in a natural combinatorial proof system, that
Spoiler wins the existential 2-pebble game on G, H. The proof size is bounded
from below by the game length D(G,H), and a crucial ingredient of our
analysis is the existence of G, H with D(G,H) = Ω(v(G)v(H)). We find one
such example among old benchmark instances for the arc consistency problem and
also suggest a new, different construction.
Comment: 19 pages, 5 figures
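Constraint propagation for arc consistency, as discussed in this abstract, can be sketched roughly as follows. This is a naive fixed-point loop for illustration only, not one of the optimized algorithms the paper analyses. The function name `arc_consistency`, the representation of both graphs as sets of directed edge pairs, and the omission of vertex colors and self-loops are assumptions. Each vertex of the source graph keeps a domain of candidate images in the target graph, and a candidate is discarded once some edge leaves it without support.

```python
def arc_consistency(g_vertices, g_edges, h_vertices, h_edges):
    """Prune candidate images of each source vertex until nothing changes.

    g_edges and h_edges are collections of directed pairs (u, v);
    an undirected edge is given as both orientations.
    Assumes the source graph has no self-loops.
    """
    h_edges = set(h_edges)
    domains = {x: set(h_vertices) for x in g_vertices}

    changed = True
    while changed:
        changed = False
        for x, y in g_edges:
            # Keep values of x that have a partner in the domain of y ...
            keep_x = {a for a in domains[x]
                      if any((a, b) in h_edges for b in domains[y])}
            # ... and values of y that have a partner among the kept values of x.
            keep_y = {b for b in domains[y]
                      if any((a, b) in h_edges for a in keep_x)}
            if keep_x != domains[x] or keep_y != domains[y]:
                domains[x], domains[y] = keep_x, keep_y
                changed = True
    return domains


# Directed path x -> y -> z against a single edge 1 -> 2: propagation
# empties every domain, so no homomorphism exists.
print(arc_consistency("xyz", [("x", "y"), ("y", "z")], [1, 2], [(1, 2)]))
```

An empty domain refutes the instance, but full domains do not prove a homomorphism exists: for an undirected triangle mapped to a single undirected edge, every value keeps its support even though the triangle is not 2-colorable.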
The complexity of rooted phylogeny problems
Several computational problems in phylogenetic reconstruction can be formulated as restrictions of the following general problem: given a formula in conjunctive normal form where the atomic formulas are rooted triples, is there a rooted binary tree that satisfies the formula? If the formulas do not contain disjunctions and negations, the problem becomes the famous rooted triple consistency problem, which can be solved in polynomial time by an algorithm of Aho, Sagiv, Szymanski, and Ullman. If the clauses in the formulas are restricted to disjunctions of negated triples, Ng, Steel, and Wormald showed that the problem remains NP-complete. We systematically study the computational complexity of the problem for all such restrictions of the clauses in the input formula. For certain restricted disjunctions of triples we present an algorithm that has sub-quadratic running time and is asymptotically as fast as the fastest known algorithm for the rooted triple consistency problem. We also show that any restriction of the general rooted phylogeny problem that does not fall into our tractable class is NP-complete, using known results about the complexity of Boolean constraint satisfaction problems. Finally, we present a pebble game argument that shows that the rooted triple consistency problem (and also all generalizations studied in this paper) cannot be solved by Datalog.
Complete Axiomatizations of Fragments of Monadic Second-Order Logic on Finite Trees
We consider a specific class of tree structures that can represent basic
structures in linguistics and computer science such as XML documents, parse
trees, and treebanks, namely, finite node-labeled sibling-ordered trees. We
present axiomatizations of the monadic second-order logic (MSO), monadic
transitive closure logic (FO(TC1)) and monadic least fixed-point logic
(FO(LFP1)) theories of this class of structures. These logics can express
important properties such as reachability. Using model-theoretic techniques, we
show by a uniform argument that these axiomatizations are complete, i.e., each
formula that is valid on all finite trees is provable using our axioms. As a
backdrop to our positive results, on arbitrary structures, the logics that we
study are known to be non-recursively axiomatizable.
Enhancing Fixed Point Logic with Cardinality Quantifiers
Let Q_IFP be any quantifier such that FO(Q_IFP), first-order logic enhanced with Q_IFP and its vectorizations, equals inductive fixed point logic, IFP, in expressive power. It is known that for certain quantifiers Q, the equivalence FO(Q_IFP) ≡ IFP is no longer true if Q is added on both sides. Rather, we have FO(Q_IFP, Q) < IFP(Q) in such cases. We extend these results to a great variety of quantifiers, namely all unbounded simple cardinality quantifiers. Our argument also applies to partial fixed point logic, PFP. In order to establish an analogous result for least fixed point logic, LFP, we exhibit a general method to pass from arbitrary quantifiers to monotone quantifiers. Our proof shows that the tree isomorphism problem is not definable in infinitary logic extended with all monadic quantifiers and their vectorizations, where a finite bound is imposed on the number of variables as well as on the number of nested quantifiers in Q_1. This strengthens a result of Etessami and Immerman by which tree isomorphism is not definable in TC + COUNTING.