Optimizing Phylogenetic Supertrees Using Answer Set Programming
The supertree construction problem is about combining several phylogenetic
trees with possibly conflicting information into a single tree that has all the
leaves of the source trees as its leaves and whose relationships between the
leaves are as consistent with the source trees as possible. This leads to a
computationally challenging optimization problem, for which heuristic methods
such as matrix representation with parsimony (MRP) are typically
used. In this paper we consider the use of answer set programming to solve the
supertree construction problem in terms of two alternative encodings. The first
is based on an existing encoding of trees using substructures known as
quartets, while the other novel encoding captures the relationships present in
trees through direct projections. We use these encodings to compute a
genus-level supertree for the family of cats (Felidae). Furthermore, we compare
our results to recent supertrees obtained by the MRP method.
Comment: To appear in Theory and Practice of Logic Programming (TPLP),
Proceedings of ICLP 201
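As an illustration of the quartet substructures mentioned in the abstract above, the following minimal Python sketch lists the quartets displayed by a single rooted tree. It is not the paper's answer set programming encoding. The nested-tuple tree representation and the function names `clades` and `quartets` are assumptions made for this example only. The sketch relies on the standard fact that a tree displays the quartet ab|cd when some clade contains exactly a and b (or exactly c and d) out of the four leaves.

```python
from itertools import combinations


def clades(tree):
    """Return (leaf set of `tree`, leaf sets of all its proper subtrees).

    A tree is either a leaf name (str) or a tuple of subtrees
    (hypothetical representation chosen for this sketch).
    """
    if isinstance(tree, str):
        return frozenset([tree]), []
    leaves = frozenset()
    found = []
    for child in tree:
        child_leaves, child_clades = clades(child)
        leaves |= child_leaves
        found.extend(child_clades)
        found.append(child_leaves)
    return leaves, found


def quartets(tree):
    """Return the set of quartets ab|cd displayed by `tree`.

    Each quartet is stored as a sorted pair of sorted leaf pairs,
    e.g. (('A', 'B'), ('C', 'D')) for AB|CD.
    """
    leaves, splits = clades(tree)
    result = set()
    for four in combinations(sorted(leaves), 4):
        four_set = set(four)
        for clade in splits:
            inside = four_set & clade
            # A clade holding exactly two of the four leaves separates
            # that pair from the other pair.
            if len(inside) == 2:
                outside = four_set - inside
                pair_in = tuple(sorted(inside))
                pair_out = tuple(sorted(outside))
                result.add(tuple(sorted([pair_in, pair_out])))
    return result


if __name__ == "__main__":
    caterpillar = (((("A", "B"), "C"), "D"), "E")
    for q in sorted(quartets(caterpillar)):
        print(q)
```

An unresolved (star) tree displays no quartets under this definition, which is one reason quartet sets are a convenient way to compare how much relationship information different source trees carry.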
Cophenetic metrics for phylogenetic trees, after Sokal and Rohlf
Phylogenetic tree comparison metrics are an important tool in the study of
evolution, and hence the definition of such metrics is an interesting problem
in phylogenetics. In a paper in Taxon fifty years ago, Sokal and Rohlf proposed
to measure quantitatively the difference between a pair of phylogenetic trees
by first encoding them by means of their half-matrices of cophenetic values,
and then comparing these matrices. This idea has been used several times since
then to define dissimilarity measures between phylogenetic trees but, to our
knowledge, no proper metric on weighted phylogenetic trees with nested taxa
based on this idea has been formally defined and studied yet. Actually, the
cophenetic values of pairs of different taxa alone are not enough to single out
phylogenetic trees with weighted arcs or nested taxa. In this paper we define a
family of cophenetic metrics that compare phylogenetic trees on the same set of
taxa by encoding them by means of their vectors of cophenetic values of pairs
of taxa and depths of single taxa, and then computing the norm of the
difference of the corresponding vectors. Then, we study, either analytically or
numerically, some of their basic properties: neighbors, diameter, distribution,
and their rank correlation with each other and with other metrics.
Comment: The "authors' cut" of a paper published in BMC Bioinformatics 14:3
(2013). 46 page
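The construction described in the abstract above can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' implementation. It assumes a tree is given as a hypothetical `parent` dictionary mapping each non-root node to a `(parent, arc_weight)` pair, takes the cophenetic value of a pair of taxa to be the depth of their lowest common ancestor, and takes the depth of a node to be the sum of arc weights on its path from the root. The function names are likewise invented for the example.

```python
from itertools import combinations_with_replacement


def depths(parent):
    """Depth of every node: sum of arc weights on the path from the root."""
    depth = {}

    def visit(node):
        if node not in depth:
            if node not in parent:          # the root has no parent entry
                depth[node] = 0.0
            else:
                up, weight = parent[node]
                depth[node] = visit(up) + weight
        return depth[node]

    for node in set(parent) | {up for up, _ in parent.values()}:
        visit(node)
    return depth


def ancestors(parent, node):
    """The node itself followed by its ancestors, ending at the root."""
    chain = [node]
    while node in parent:
        node = parent[node][0]
        chain.append(node)
    return chain


def cophenetic_vector(parent, taxa):
    """Cophenetic values of all pairs of taxa, plus depths of single taxa.

    The pair (a, a) contributes the depth of taxon a itself.
    """
    depth = depths(parent)
    vector = []
    for a, b in combinations_with_replacement(sorted(taxa), 2):
        above_a = set(ancestors(parent, a))
        # First ancestor of b that is also above a: lowest common ancestor.
        lca = next(x for x in ancestors(parent, b) if x in above_a)
        vector.append(depth[lca])
    return vector


def cophenetic_distance(parent1, parent2, taxa, p=1):
    """L^p norm of the difference of the two cophenetic vectors."""
    v1 = cophenetic_vector(parent1, taxa)
    v2 = cophenetic_vector(parent2, taxa)
    return sum(abs(x - y) ** p for x, y in zip(v1, v2)) ** (1.0 / p)


if __name__ == "__main__":
    # ((A,B),C) and (A,(B,C)) with unit arc weights (made-up example trees).
    t1 = {"u": ("r", 1), "A": ("u", 1), "B": ("u", 1), "C": ("r", 1)}
    t2 = {"v": ("r", 1), "B": ("v", 1), "C": ("v", 1), "A": ("r", 1)}
    print(cophenetic_distance(t1, t2, ["A", "B", "C"], p=1))
    print(cophenetic_distance(t1, t2, ["A", "B", "C"], p=2))
```

Including the depths of single taxa in the vector is what lets the comparison account for arc weights and nested taxa, which the abstract notes the pairwise cophenetic values alone cannot do.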
Disk Covering Methods Improve Phylogenomic Analyses
Motivation: With the rapid growth rate of newly sequenced genomes, species tree inference from multiple genes has become a basic bioinformatics task in comparative and evolutionary biology. However, accurate species tree estimation is difficult in the presence of gene tree discordance, which is often due to incomplete lineage sorting (ILS), modelled by the multi-species coalescent. Several highly accurate coalescent-based species tree estimation methods have been developed over the last decade, including MP-EST. However, the running time for MP-EST increases rapidly as the number of species grows.
Results: We present divide-and-conquer techniques that improve the scalability of MP-EST so that it can run efficiently on large datasets. Surprisingly, this technique also improves the accuracy of species trees estimated by MP-EST, as our study shows on a collection of simulated and biological datasets.
NSF DEB 0733029, DBI 1062335
Computer Science