    Efficiently Calculating Evolutionary Tree Measures Using SAT

    We develop techniques to calculate important measures in evolutionary biology by encoding them as CNF formulas and using powerful SAT solvers. Comparing evolutionary trees is a necessary step in tree reconstruction algorithms, in locating recombination and lateral gene transfer, and in analyzing and visualizing sets of trees. We focus on two popular comparison measures for trees: the hybridization number and the rooted subtree-prune-and-regraft (rSPR) distance. Both have recently been shown to be NP-hard, and efficient algorithms are needed to compute and approximate these measures. We encode each measure as a Boolean formula such that two trees have hybridization number k (or rSPR distance k) if and only if the corresponding formula is satisfiable. We use state-of-the-art SAT solvers to determine if the formula encoding the measure has a satisfying assignment. Our encoding also provides a rich source of real-world SAT instances, and we include a comparison of several recent solvers (minisat, adaptg2wsat, novelty+p, Walksat, March KS and SATzilla). Postprint (author’s final draft).
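    The abstract above states that a formula for a given k is satisfiable exactly when the measure equals k. A minimal sketch of how such a decision encoding could be driven is shown here; it is not the authors' encoding. The names `is_satisfiable`, `smallest_satisfiable_k` and `toy_encode` are hypothetical, the brute-force check stands in for a real SAT solver, and `toy_encode` is a placeholder that is satisfiable only for k = 2.

```python
from itertools import product
from typing import Callable, List, Optional, Tuple

Clause = List[int]  # DIMACS-style literal: +v is variable v, -v is its negation
CNF = List[Clause]


def is_satisfiable(cnf: CNF, num_vars: int) -> bool:
    """Brute-force SAT check; a stand-in for a real solver (assumption)."""
    for bits in product([False, True], repeat=num_vars):
        # A clause holds if at least one of its literals matches the assignment.
        if all(any(bits[abs(lit) - 1] == (lit > 0) for lit in clause)
               for clause in cnf):
            return True
    return False


def smallest_satisfiable_k(encode: Callable[[int], Tuple[CNF, int]],
                           max_k: int) -> Optional[int]:
    """Try k = 0, 1, ..., max_k and return the first k whose formula is satisfiable."""
    for k in range(max_k + 1):
        cnf, num_vars = encode(k)
        if is_satisfiable(cnf, num_vars):
            return k
    return None


def toy_encode(k: int) -> Tuple[CNF, int]:
    """Hypothetical placeholder, NOT the paper's encoding: satisfiable only for k == 2."""
    if k == 2:
        return [[1, 2], [-1]], 2   # satisfied by x1 = False, x2 = True
    return [[1], [-1]], 1          # x1 and not x1: unsatisfiable


if __name__ == "__main__":
    print(smallest_satisfiable_k(toy_encode, max_k=5))
```

    In practice the brute-force check would be replaced by a call to one of the solvers listed in the abstract, and `encode` by the actual tree-pair encoding, neither of which is specified in this listing.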

    The agreement distance of unrooted phylogenetic networks

    A rearrangement operation makes a small graph-theoretical change to a phylogenetic network to transform it into another one. For unrooted phylogenetic trees and networks, popular rearrangement operations are tree bisection and reconnection (TBR) and prune and regraft (PR) (called subtree prune and regraft (SPR) on trees). Each of these operations induces a metric on the sets of phylogenetic trees and networks. The TBR-distance between two unrooted phylogenetic trees T and T' can be characterised by a maximum agreement forest, that is, a forest with a minimum number of components that covers both T and T' in a certain way. This characterisation has facilitated the development of fixed-parameter tractable algorithms and approximation algorithms. Here, we introduce maximum agreement graphs as a generalisation of maximum agreement forests for phylogenetic networks. While the agreement distance -- the metric induced by maximum agreement graphs -- does not characterise the TBR-distance of two networks, we show that it still provides constant-factor bounds on the TBR-distance. We find similar results for PR in terms of maximum endpoint agreement graphs. Comment: 23 pages, 13 figures, final journal version.

    Theory and Applications of Satisfiability Testing - SAT 2009


    The agreement distance of rooted phylogenetic networks

    The minimal number of rooted subtree prune and regraft (rSPR) operations needed to transform one phylogenetic tree into another one induces a metric on phylogenetic trees - the rSPR-distance. The rSPR-distance between two phylogenetic trees T and T' can be characterised by a maximum agreement forest, that is, a forest with a minimum number of components that covers both T and T'. The rSPR operation has recently been generalised to phylogenetic networks with, among others, the subnetwork prune and regraft (SNPR) operation. Here, we introduce maximum agreement graphs as explicit representations of the differences between two phylogenetic networks, thus generalising maximum agreement forests. We show that maximum agreement graphs induce a metric on phylogenetic networks - the agreement distance. While this metric does not characterise the distances induced by SNPR and other generalisations of rSPR, we prove that it still bounds these distances with constant factors. Comment: 24 pages, 16 figures.
