The Complexity of Rooted Phylogeny Problems
Several computational problems in phylogenetic reconstruction can be
formulated as restrictions of the following general problem: given a formula in
conjunctive normal form where the literals are rooted triples, is there a
rooted binary tree that satisfies the formula? If the formulas do not contain
disjunctions, the problem becomes the famous rooted triple consistency problem,
which can be solved in polynomial time by an algorithm of Aho, Sagiv,
Szymanski, and Ullman. If the clauses in the formulas are restricted to
disjunctions of negated triples, Ng, Steel, and Wormald showed that the problem
remains NP-complete. We systematically study the computational complexity of
the problem for all such restrictions of the clauses in the input formula. For
certain restricted disjunctions of triples we present an algorithm that has
sub-quadratic running time and is asymptotically as fast as the fastest known
algorithm for the rooted triple consistency problem. We also show that any
restriction of the general rooted phylogeny problem that does not fall into our
tractable class is NP-complete, using known results about the complexity of
Boolean constraint satisfaction problems. Finally, we present a pebble game
argument that shows that the rooted triple consistency problem (and also all
generalizations studied in this paper) cannot be solved by Datalog
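
For orientation, the rooted triple consistency problem mentioned in the abstract can be sketched as follows. This is a minimal, unoptimized rendering of the classical recursive idea attributed to Aho, Sagiv, Szymanski, and Ullman, not the sub-quadratic algorithm the paper presents. The function names `build`, `leaf_set`, and `displays`, the encoding of a triple ab|c as the tuple `(a, b, c)`, and the nested-tuple tree representation are all assumptions made for illustration.

```python
def build(leaves, triples):
    """Sketch of the recursive consistency test; (a, b, c) encodes ab|c.

    Returns a rooted binary tree as nested 2-tuples over the non-empty list
    `leaves` (leaves are assumed to be strings), or None if the triples
    cannot all be satisfied.
    """
    leaves = list(leaves)
    if len(leaves) == 1:
        return leaves[0]
    present = set(leaves)
    # Only triples whose three leaves all lie in the current set matter.
    relevant = [t for t in triples if set(t) <= present]

    # Union-find: join a and b for every relevant triple ab|c.
    parent = {x: x for x in leaves}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b, _c in relevant:
        parent[find(a)] = find(b)

    components = {}
    for x in leaves:
        components.setdefault(find(x), []).append(x)
    if len(components) == 1:
        return None  # one component: no split at the root is possible

    subtrees = []
    for comp in components.values():
        sub = build(comp, relevant)
        if sub is None:
            return None
        subtrees.append(sub)
    # Resolve the split into binary nodes; any resolution keeps every triple.
    tree = subtrees[0]
    for sub in subtrees[1:]:
        tree = (tree, sub)
    return tree


def leaf_set(tree):
    """Set of leaf labels below a nested-tuple tree."""
    if isinstance(tree, tuple):
        return leaf_set(tree[0]) | leaf_set(tree[1])
    return {tree}


def displays(tree, triple):
    """True if a and b share a subtree of `tree` that excludes c."""
    a, b, c = triple
    if not isinstance(tree, tuple):
        return False
    for child in tree:
        below = leaf_set(child)
        if a in below and b in below:
            return c not in below or displays(child, triple)
    return False


if __name__ == "__main__":
    print(build(["a", "b", "c", "d"], [("a", "b", "c"), ("c", "d", "a")]))
    print(build(["a", "b", "c"], [("a", "b", "c"), ("b", "c", "a")]))
```

The sketch recomputes connected components from scratch at every level, so it is simple rather than fast; it only covers conjunctions of positive triples, not the disjunctive clauses studied in the paper.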
The complexity of rooted phylogeny problems
Several computational problems in phylogenetic reconstruction can be formulated as restrictions of the following general problem: given a formula in conjunctive normal form where the atomic formulas are rooted triples, is there a rooted binary tree that satisfies the formula? If the formulas do not contain disjunctions and negations, the problem becomes the famous rooted triple consistency problem, which can be solved in polynomial time by an algorithm of Aho, Sagiv, Szymanski, and Ullman. If the clauses in the formulas are restricted to disjunctions of negated triples, Ng, Steel, and Wormald showed that the problem remains NP-complete. We systematically study the computational complexity of the problem for all such restrictions of the clauses in the input formula. For certain restricted disjunctions of triples we present an algorithm that has sub-quadratic running time and is asymptotically as fast as the fastest known algorithm for the rooted triple consistency problem. We also show that any restriction of the general rooted phylogeny problem that does not fall into our tractable class is NP-complete, using known results about the complexity of Boolean constraint satisfaction problems. Finally, we present a pebble game argument that shows that the rooted triple consistency problem (and also all generalizations studied in this paper) cannot be solved by Datalog
The Complexity of Surjective Homomorphism Problems -- a Survey
We survey known results about the complexity of surjective homomorphism
problems, studied in the context of related problems in the literature such as
list homomorphism, retraction and compaction. In comparison with these
problems, surjective homomorphism problems seem to be harder to classify and we
examine especially three concrete problems that have arisen from the
literature, two of which remain of open complexity.
Phylogenetic CSPs are Approximation Resistant
We study the approximability of a broad class of computational problems --
originally motivated in evolutionary biology and phylogenetic reconstruction --
concerning the aggregation of potentially inconsistent (local) information
about items of interest, and we present optimal hardness of approximation
results under the Unique Games Conjecture. The class of problems studied here
can be described as Constraint Satisfaction Problems (CSPs) over infinite
domains, where instead of values in {0,1} or a fixed-size domain, the
variables can be mapped to any of the leaves of a phylogenetic tree. The
topology of the tree then determines whether a given constraint on the
variables is satisfied or not, and the resulting CSPs are called Phylogenetic
CSPs. Prominent examples of Phylogenetic CSPs with a long history and
applications in various disciplines include: Triplet Reconstruction, Quartet
Reconstruction, Subtree Aggregation (Forbidden or Desired). For example, in
Triplet Reconstruction, we are given triplets of the form ij|k
(indicating that "items i, j are more similar to each other than to k")
and we want to construct a hierarchical clustering on the items that
respects the constraints as much as possible. Despite more than four decades of
research, the basic question of maximizing the number of satisfied constraints
is not well-understood. The current best approximation is achieved by
outputting a random tree (for triplets, this achieves a 1/3 approximation). Our
main result is that every Phylogenetic CSP is approximation resistant, i.e.,
there is no polynomial-time algorithm that does asymptotically better than a
(biased) random assignment. This is a generalization of the results in
Guruswami, Håstad, Manokaran, Raghavendra, and Charikar (2011), who showed that
ordering CSPs are approximation resistant (e.g., Max Acyclic Subgraph,
Betweenness).
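
The 1/3 figure quoted above for a random tree on triplets can be checked exactly on a small instance. The sketch below fixes one binary tree shape, averages over every assignment of labels to its leaves, and reports the expected fraction of satisfied triplets. It is only an illustration of why a random labelling achieves 1/3, not the paper's argument; the helper names (`leaves_of`, `satisfies`, `relabel`, `expected_fraction`), the tuple encoding `(i, j, k)` for a triplet ij|k, and the example instance are all assumptions.

```python
from fractions import Fraction
from itertools import permutations


def leaves_of(tree):
    """Set of leaf labels of a nested-tuple binary tree."""
    if isinstance(tree, tuple):
        return leaves_of(tree[0]) | leaves_of(tree[1])
    return {tree}


def satisfies(tree, triplet):
    """True if i and j share a subtree of `tree` that excludes k."""
    i, j, k = triplet
    if not isinstance(tree, tuple):
        return False
    for child in tree:
        below = leaves_of(child)
        if i in below and j in below:
            return k not in below or satisfies(child, triplet)
    return False


def relabel(tree, mapping):
    """Copy of `tree` with every leaf label replaced via `mapping`."""
    if isinstance(tree, tuple):
        return tuple(relabel(child, mapping) for child in tree)
    return mapping[tree]


def expected_fraction(shape, triplets):
    """Exact average fraction of satisfied triplets over all leaf labellings."""
    labels = sorted(leaves_of(shape))
    total = Fraction(0)
    count = 0
    for perm in permutations(labels):
        tree = relabel(shape, dict(zip(labels, perm)))
        satisfied = sum(satisfies(tree, t) for t in triplets)
        total += Fraction(satisfied, len(triplets))
        count += 1
    return total / count


if __name__ == "__main__":
    shape = (("a", "b"), ("c", "d"))  # hypothetical 4-leaf instance
    triplets = [("a", "b", "c"), ("b", "c", "a"), ("c", "d", "b")]
    print(expected_fraction(shape, triplets))
```

The example triplets are mutually inconsistent on purpose: the average depends only on the number of triplets, since any three leaves of a binary tree have exactly one closest pair and each pair is equally likely under a uniformly random labelling.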