
Invariant versus classical quartet inference when evolution is heterogeneous across sites and lineages

Abstract

One reason classical phylogenetic reconstruction methods fail to infer the underlying topology correctly is that they assume oversimplified models. In this paper we propose a topology reconstruction method that is consistent with the most general Markov model of nucleotide substitution and can also handle data arising from mixtures on the same topology. It is based on an idea of Eriksson for using phylogenetic invariants, and it provides a system of weights that can serve as input to quartet-based methods. We study its performance on real data and on a wide range of simulated 4-taxon data (both time-homogeneous and nonhomogeneous, with or without among-site rate heterogeneity, and with different branch length settings). We compare it to the classical methods of neighbor-joining (with paralinear distance), maximum likelihood (ML, with different underlying models), and maximum parsimony. Our results show that this method is accurate and robust, performs similarly to ML when the data satisfy the assumptions of both methods, and outperforms all the other methods when they are based on inappropriate substitution models or when both long and short branches are present. If alignments are long enough, it also outperforms the other methods when some of its own assumptions are violated.

Comment: 32 pages; 9 figures
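To make the invariant-based idea more concrete, here is a minimal sketch of one way rank-based quartet scores can be computed. It assumes the standard fact that, under a general Markov model on the quartet tree 12|34, the 16x16 "flattening" of the site-pattern distribution along the split 12|34 has rank at most 4, so the distance of the empirical flattening to the rank-4 matrices can serve as a score for each of the three possible splits. The abstract does not state the exact score or how scores become weights, so this is an illustration under that assumption and not the authors' implementation; the names `flattening`, `rank_score`, and `quartet_scores` are hypothetical.

```python
import numpy as np

BASES = "ACGT"

# The three possible splits of four taxa, as pairs of 0-based taxon indices.
SPLITS = {
    "12|34": ((0, 1), (2, 3)),
    "13|24": ((0, 2), (1, 3)),
    "14|23": ((0, 3), (1, 2)),
}


def flattening(seqs, split):
    """Return the 16x16 matrix of relative site-pattern frequencies.

    Rows index the joint state of the first pair of taxa in `split`,
    columns the joint state of the second pair. Alignment columns that
    contain anything other than A, C, G, T are skipped.
    """
    (a, b), (c, d) = split
    idx = {ch: i for i, ch in enumerate(BASES)}
    M = np.zeros((16, 16))
    for col in zip(*seqs):
        if all(ch in idx for ch in col):
            s = [idx[ch] for ch in col]
            M[4 * s[a] + s[b], 4 * s[c] + s[d]] += 1
    total = M.sum()
    return M / total if total else M


def rank_score(M, k=4):
    """Frobenius distance from M to the nearest matrix of rank <= k.

    By the Eckart-Young theorem this is the root sum of squares of the
    singular values beyond the k largest.
    """
    sv = np.linalg.svd(M, compute_uv=False)
    return float(np.sqrt(np.sum(sv[k:] ** 2)))


def quartet_scores(seqs):
    """Score each split; a smaller score means better support (assumption)."""
    return {name: rank_score(flattening(seqs, split))
            for name, split in SPLITS.items()}


# Toy alignment: taxa 1 and 2 are identical, and so are taxa 3 and 4.
s1 = "AAAACCCCGGGGTTTT"
s3 = "ACGTACGTACGTACGT"
scores = quartet_scores([s1, s1, s3, s3])
best = min(scores, key=scores.get)
```

On this toy alignment the flattening for 12|34 has rank 1, so its score is numerically zero, while the other two flattenings are diagonal with 16 equal entries and score well above zero, so `best` is `"12|34"`. For data from a mixture of m classes on the same topology, the same reasoning would suggest comparing against rank 4m instead of 4 (the `k` parameter), again as an assumption rather than a description of the paper's procedure.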
