96 research outputs found

Optimal Completion and Comparison of Incomplete Phylogenetic Trees Under Robinson-Foulds Distance

Author: Bansal Mukul S.
Yao Keegan
Publication venue: LIPIcs - Leibniz International Proceedings in Informatics. 32nd Annual Symposium on Combinatorial Pattern Matching (CPM 2021)
Publication date: 01/01/2021

Dagstuhl Research Online Publication Server

Synthesizing species trees from gene trees using the parameterized and graph-theoretic approaches

Author: Moon Ju Cheol
Publication venue: Iowa State University Digital Repository
Publication date: 01/01/2017

Gene trees describe how parts of the species have evolved over time, and it is assumed that gene trees have evolved along the branches of the species tree. However, some gene trees are often discordant with the corresponding species tree due to the complicated evolutionary history of genes. To overcome this obstacle, median problems have emerged as a major tool for synthesizing species trees by reconciling discordance in a given collection of gene trees. Given a collection of gene trees and a cost function, the median problem seeks a tree, called a median tree, that minimizes the overall cost to the gene trees. Median tree problems are typically NP-hard, and there is increasing interest in making such median tree problems available for large-scale species tree construction. In this thesis, we first show that the gene duplication median tree problem satisfies the weaker version of the Pareto property and propose a parameterized algorithm to solve the gene duplication median tree problem. Second, we design two efficient methods to handle the issues of applying the parameterized algorithm to unrooted gene trees that are sampled from different species. Third, we introduce the graph-theoretic formulation of the Robinson-Foulds median tree problem and a new tree edit operation. Fourth, we propose a new metric between two phylogenetic trees and examine the statistical properties of the metric. Finally, we propose a new clustering criterion in a bipartite network, together with a new NP-hard problem and its ILP formulation.

Digital Repository @ Iowa State University (ISU)
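
The median tree objective described in the abstract above can be illustrated with a small sketch. It assumes rooted trees written as nested tuples over one shared leaf set, and it uses the cluster-based (rooted) form of the Robinson-Foulds distance as the cost function. The names `clusters`, `rf_distance`, and `median_cost` are hypothetical and are not taken from the thesis; the search is a brute-force scan over a given candidate list, not the parameterized algorithm the thesis proposes.

```python
from typing import FrozenSet, List, Set, Tuple, Union

# Assumption: a tree is either a leaf label or a tuple of subtrees.
Tree = Union[str, Tuple["Tree", ...]]


def clusters(tree: Tree) -> Set[FrozenSet[str]]:
    """Collect the leaf sets below every internal node except the root."""
    found: Set[FrozenSet[str]] = set()

    def leaves(node: Tree) -> FrozenSet[str]:
        if isinstance(node, str):
            return frozenset([node])
        below = frozenset().union(*(leaves(child) for child in node))
        found.add(below)
        return below

    root = leaves(tree)
    found.discard(root)  # the full leaf set occurs in every tree
    return found


def rf_distance(a: Tree, b: Tree) -> int:
    """Rooted Robinson-Foulds distance: clusters present in exactly one tree."""
    return len(clusters(a) ^ clusters(b))


def median_cost(candidate: Tree, gene_trees: List[Tree]) -> int:
    """Overall cost of a candidate: sum of its distances to all gene trees."""
    return sum(rf_distance(candidate, g) for g in gene_trees)


# Illustrative input only: three gene trees on the leaves a, b, c, d.
t1: Tree = ((("a", "b"), "c"), "d")
t2: Tree = ((("a", "b"), "d"), "c")
t3: Tree = ((("a", "c"), "b"), "d")
gene_trees = [t1, t2, t3]

# Brute force over the input trees themselves as candidate medians.
best = min(gene_trees, key=lambda t: median_cost(t, gene_trees))
```

Restricting the candidates to the input trees keeps the example short; a real median tree need not be one of the inputs, which is part of what makes the problem hard.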

Constructing majority-rule supertrees

Author: A Purvis
AD Gordon
BR Baum
C Semple
CG Sibley
D Bryant
D Gusfield
D Pisani
David Fernández-Baca
DF Robinson
DG Brown
E Danna
EN Adams
F Delsuc
FR McMorris
G Sierksma
GB Nunn
J Dong
JA Cotton
Jianrong Dong
JP Barthélemy
M Kennedy
M Wilkinson
MA Ragan
MA Steel
MdL Brooke
N Amenta
ND Pattengale
ORP Bininda-Emonds
P Goloboff
PA Goloboff
S Sridhar
T Margush
V Ranwez
W Day
Publication venue: BioMed Central
Publication date: 01/01/2010

Background: Supertree methods combine the phylogenetic information from multiple partially-overlapping trees into a larger phylogenetic tree called a supertree. Several supertree construction methods have been proposed to date, but most of these are not designed with any specific properties in mind. Recently, Cotton and Wilkinson proposed extensions of the majority-rule consensus tree method to the supertree setting that inherit many of the appealing properties of the former. Results: We study a variant of one of Cotton and Wilkinson's methods, called majority-rule (+) supertrees. After proving that a key underlying problem for constructing majority-rule (+) supertrees is NP-hard, we develop a polynomial-size exact integer linear programming formulation of the problem. We then present a data reduction heuristic that identifies smaller subproblems that can be solved independently. While this technique is not guaranteed to produce optimal solutions, it can achieve substantial problem-size reduction. Finally, we report on a computational study of our approach on various real data sets, including the 121-taxon, 7-tree Seabirds data set of Kennedy and Page. Conclusions: The results indicate that our exact method is computationally feasible for moderately large inputs. For larger inputs, our data reduction heuristic makes it feasible to tackle problems that are well beyond the range of the basic integer programming approach. Comparisons between the results obtained by our heuristic and exact solutions indicate that the heuristic produces good answers. Our results also suggest that the majority-rule (+) approach, in both its basic form and with data reduction, yields biologically meaningful phylogenies.

CiteSeerX

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central
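
The abstract above builds on the majority-rule consensus tree method. As a rough sketch of that underlying idea only, and not of the majority-rule (+) supertree method or its ILP formulation, the snippet below keeps every cluster that occurs in more than half of the input trees. It assumes all trees share the same leaf set and are already given as sets of clusters; the name `majority_clusters` and the example profile are hypothetical.

```python
from collections import Counter
from typing import FrozenSet, List, Set

Cluster = FrozenSet[str]


def majority_clusters(profile: List[Set[Cluster]]) -> Set[Cluster]:
    """Return the clusters found in strictly more than half of the trees."""
    counts = Counter(cluster for tree in profile for cluster in tree)
    return {cluster for cluster, n in counts.items() if 2 * n > len(profile)}


# Illustrative profile: three trees on the leaves a, b, c, d,
# each described by its nontrivial clusters.
ab = frozenset("ab")
ac = frozenset("ac")
abc = frozenset("abc")
abd = frozenset("abd")
profile = [{ab, abc}, {ab, abd}, {ac, abc}]

consensus = majority_clusters(profile)
```

Clusters that each appear in a strict majority of trees are pairwise compatible, so the returned set always corresponds to a tree. The supertree setting in the abstract is harder because the input trees only partially overlap.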

MUL-Tree Pruning for Consistency and Compatibility

Author: Hampson Christopher
Harvey Daniel J.
Iliopoulos Costas S.
Jansson Jesper
Lim Zara
Sung Wing-Kin
Publication venue: LIPIcs - Leibniz International Proceedings in Informatics. 34th Annual Symposium on Combinatorial Pattern Matching (CPM 2023)
Publication date: 01/01/2023

A multi-labelled tree (or MUL-tree) is a rooted tree leaf-labelled by a set of labels, where each label may appear more than once in the tree. We consider the MUL-tree Set Pruning for Consistency problem (MULSETPC), which takes as input a set of MUL-trees and asks whether there exists a perfect pruning of each MUL-tree that results in a consistent set of single-labelled trees. MULSETPC was proven to be NP-complete by Gascon et al. when the MUL-trees are binary, each leaf label is used at most three times, and the number of MUL-trees is unbounded. Determining the computational complexity of the problem when the number of MUL-trees is constant was left as an open problem. Here, we resolve this question by proving a much stronger result, namely that MULSETPC is NP-complete even when there are only two MUL-trees, every leaf label is used at most twice, and every MUL-tree is either binary or has constant height. Furthermore, we introduce an extension of MULSETPC that we call MULSETPComp, which replaces the notion of consistency with compatibility, and prove that MULSETPComp is NP-complete even when there are only two MUL-trees, every leaf label is used at most thrice, and every MUL-tree has constant height. Finally, we present a polynomial-time algorithm for instances of MULSETPC with a constant number of binary MUL-trees, in the special case where every leaf label occurs exactly once in at least one MUL-tree.

Dagstuhl Research Online Publication Server

Does the choice of nucleotide substitution models matter topologically?

Author: Darriba Diego
Hoff Michael
Orf Stefan
Riehm Benedikt
Stamatakis Alexandros
Publication venue: BioMed Central
Publication date: 01/01/2016

Background: In the context of a master-level programming practical at the computer science department of the Karlsruhe Institute of Technology, we developed and make available an open-source code for testing all 203 possible nucleotide substitution models in the Maximum Likelihood (ML) setting under the common Akaike, corrected Akaike, and Bayesian information criteria. We address the question of whether model selection matters topologically, that is, whether conducting ML inferences under the optimal model, instead of a standard General Time Reversible (GTR) model, yields different tree topologies. We also assess to what degree the models selected and the trees inferred under the three standard criteria (AIC, AICc, BIC) differ. Finally, we assess whether the definition of the sample size (#sites versus #sites × #taxa) yields different models and, as a consequence, different tree topologies. Results: We find that all three factors (by order of impact: nucleotide model selection, information criterion used, sample size definition) can yield substantially different final tree topologies (topological difference exceeding 10 %) for approximately 5 % of the tree inferences conducted on the 39 empirical datasets used in our study. Conclusions: We find that using the best-fit nucleotide substitution model may change the final ML tree topology compared to an inference under a default GTR model. The effect is less pronounced when comparing distinct information criteria. Nonetheless, in some cases we did obtain substantial topological differences.

Repositorio da Universidade da Coruña

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

KITopen

Springer - Publisher Connector

PubMed Central
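
The three criteria named in the abstract above have standard textbook definitions, sketched below to show where the sample size enters. The function name `information_criteria` and all numeric inputs are made up for illustration; this is not the code released by the authors. Here `log_likelihood` is the maximized log-likelihood, `k` the number of free parameters, and `n` the sample size.

```python
import math
from typing import Dict


def information_criteria(log_likelihood: float, k: int, n: int) -> Dict[str, float]:
    """Standard AIC, AICc, and BIC scores (lower is better)."""
    if n - k - 1 <= 0:
        raise ValueError("AICc requires n > k + 1")
    aic = 2 * k - 2 * log_likelihood
    aicc = aic + (2 * k * (k + 1)) / (n - k - 1)  # small-sample correction
    bic = k * math.log(n) - 2 * log_likelihood
    return {"AIC": aic, "AICc": aicc, "BIC": bic}


# Hypothetical values: one fitted model, scored under both
# sample size definitions mentioned in the abstract.
log_likelihood, k = -1000.0, 10
sites, taxa = 500, 20

by_sites = information_criteria(log_likelihood, k, n=sites)
by_sites_times_taxa = information_criteria(log_likelihood, k, n=sites * taxa)
```

AIC does not depend on `n`, so only AICc and BIC can change with the sample size definition. A larger `n` shrinks the AICc correction and increases the BIC penalty per parameter.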

Does the choice of nucleotide substitution models matter topologically?

Author: A Stamatakis
Alexandros Stamatakis
Benedikt Riehm
C Lakner
D Darriba
D Posada
D Robinson
Diego Darriba
GW Grimm
J Felsenstein
J Ripplinger
JP Huelsenbeck
M Hasegawa
Michael Hoff
S Guindon
S Tavaré
Stefan Orf
T Flouri
TH Jukes
W Fletcher
Z Yang
Publication venue: Springer Science and Business Media LLC

Crossref