Efficiently Calculating Evolutionary Tree Measures Using SAT
We develop techniques to calculate important measures in evolutionary biology by encoding to CNF formulas and using powerful SAT solvers. Comparing evolutionary trees is a necessary step in tree reconstruction algorithms, locating recombination and lateral gene transfer, and in analyzing and visualizing sets of trees. We focus on two popular comparison measures for trees: the hybridization number and the rooted subtree-prune-and-regraft (rSPR) distance. Both have recently been shown to be NP-hard, and efficient algorithms are needed to compute and approximate these measures. We encode these as a Boolean formula such that two trees have hybridization number k (or rSPR distance k) if and only if the corresponding formula is satisfiable. We use state-of-the-art SAT solvers to determine if the formula encoding the measure has a satisfying assignment. Our encoding also provides a rich source of real-world SAT instances, and we include a comparison of several recent solvers (minisat, adaptg2wsat, novelty+p, Walksat, March KS and SATzilla).
Postprint (author’s final draft)
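The workflow described in this abstract (build a formula for a given k, ask a solver whether it is satisfiable) can be illustrated with a minimal Python sketch. Everything named here is an assumption for illustration only: `toy_encode` is a hypothetical stand-in, not the paper's tree encoding, and the brute-force `is_satisfiable` takes the place of a real solver such as minisat. The sketch also assumes the formula stays satisfiable once k is large enough, so scanning k upward finds the smallest feasible value.

```python
from itertools import combinations, product


def is_satisfiable(clauses, num_vars):
    """Brute-force check of a CNF formula given as lists of signed integers.

    Literal v means variable v is true, -v means it is false (DIMACS style).
    A real workflow would hand the clauses to a SAT solver instead.
    """
    for bits in product([False, True], repeat=num_vars):
        if all(any(bits[abs(lit) - 1] == (lit > 0) for lit in clause)
               for clause in clauses):
            return True
    return False


def toy_encode(k, num_vars=3):
    """Hypothetical encoder: NOT the hybridization or rSPR encoding.

    Forces every variable to be true and allows at most k true variables,
    so the formula is satisfiable exactly when k >= num_vars.
    """
    variables = range(1, num_vars + 1)
    clauses = [[v] for v in variables]  # unit clauses: each variable is true
    # Naive at-most-k constraint: every subset of size k + 1 has a false member.
    for subset in combinations(variables, k + 1):
        clauses.append([-v for v in subset])
    return clauses


def smallest_satisfiable_k(encode, num_vars, max_k):
    """Return the smallest k in 0..max_k whose formula is satisfiable, else None."""
    for k in range(max_k + 1):
        if is_satisfiable(encode(k), num_vars):
            return k
    return None


print(smallest_satisfiable_k(toy_encode, num_vars=3, max_k=5))
```

With a real encoding, only `toy_encode` and the solver call would change; the outer loop over k stays the same.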
The agreement distance of unrooted phylogenetic networks
A rearrangement operation makes a small graph-theoretical change to a
phylogenetic network to transform it into another one. For unrooted
phylogenetic trees and networks, popular rearrangement operations are tree
bisection and reconnection (TBR) and prune and regraft (PR) (called subtree
prune and regraft (SPR) on trees). Each of these operations induces a metric on
the sets of phylogenetic trees and networks. The TBR-distance between two
unrooted phylogenetic trees T and T' can be characterised by a maximum
agreement forest, that is, a forest with a minimum number of components that
covers both T and T' in a certain way. This characterisation has
facilitated the development of fixed-parameter tractable algorithms and
approximation algorithms. Here, we introduce maximum agreement graphs as a
generalisation of maximum agreement forests for phylogenetic networks. While
the agreement distance -- the metric induced by maximum agreement graphs --
does not characterise the TBR-distance of two networks, we show that it still
provides constant-factor bounds on the TBR-distance. We find similar results
for PR in terms of maximum endpoint agreement graphs.
Comment: 23 pages, 13 figures, final journal version
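For context, the forest characterisation of the TBR-distance between two unrooted trees that this abstract mentions is usually stated as the identity below. It is a standard result for trees that the abstract does not spell out, and the notation is assumed here: T and T' are the two trees and m(T, T') is the number of components of a maximum agreement forest for them.

```latex
d_{\mathrm{TBR}}(T, T') = m(T, T') - 1
```

Identical trees give a one-component forest and hence distance 0.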
The agreement distance of rooted phylogenetic networks
The minimal number of rooted subtree prune and regraft (rSPR) operations
needed to transform one phylogenetic tree into another one induces a metric on
phylogenetic trees - the rSPR-distance. The rSPR-distance between two
phylogenetic trees T and T' can be characterised by a maximum agreement
forest, that is, a forest with a minimum number of components that covers
both T and T'. The rSPR operation has recently been generalised to phylogenetic networks
with, among others, the subnetwork prune and regraft (SNPR) operation. Here, we
introduce maximum agreement graphs as explicit representations of
differences of two phylogenetic networks, thus generalising maximum agreement
forests. We show that maximum agreement graphs induce a metric on phylogenetic
networks - the agreement distance. While this metric does not characterise the
distances induced by SNPR and other generalisations of rSPR, we prove that it
still bounds these distances with constant factors.
Comment: 24 pages, 16 figures
Efficiently calculating evolutionary tree measures using SAT
We develop techniques to calculate important measures in evolutionary biology by encoding to CNF formulas and using powerful SAT solvers. Comparing evolutionary trees is a necessary step in tree reconstruction algorithms, locating recombination and lateral gene transfer, and in analyzing and visualizing sets of trees. We focus on two popular comparison measures for trees: the hybridization number and the rooted subtree-prune-and-regraft (rSPR) distance. Both have recently been shown to be NP-hard, and efficient algorithms are needed to compute and approximate these measures. We encode these as a Boolean formula such that two trees have hybridization number k (or rSPR distance k) if and only if the corresponding formula is satisfiable. We use state-of-the-art SAT solvers to determine if the formula encoding the measure has a satisfying assignment. Our encoding also provides a rich source of real-world SAT instances, and we include a comparison of several recent solvers (minisat, adaptg2wsat, novelty+p, Walksat, March KS and SATzilla).