
Assessing the Applicability of the GTR Nucleotide Substitution Model Through Simulations

By Laurent Gatto, Daniele Catanzaro and Michel C. Milinkovitch


The General Time Reversible (GTR) model of nucleotide substitution is at the core of many distance-based and character-based phylogeny inference methods. The procedure described by Waddell and Steel (1997) for estimating distances and instantaneous substitution rate matrices, R, under the GTR model is known to be inapplicable under some conditions, i.e., it leads to the inapplicability of the GTR model. Here, we simulate the evolution of DNA sequences along 12 trees characterized by different combinations of tree length, (non-)homogeneity of the substitution rate matrix R, and sequence length. We then evaluate both the frequency of the GTR model inapplicability for estimating distances and the accuracy of inferred alignments. Our results indicate that the inapplicability of Waddell and Steel's procedure can be considered a real practical issue, and illustrate that the probability of this inapplicability is a function of substitution rates and sequence length.
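The abstract does not spell out the estimation procedure, so the following is only a minimal sketch of how such inapplicability can arise. It assumes the commonly quoted form of the GTR distance, d = -tr(Π log(Π⁻¹F)), where F is the symmetrized 4x4 matrix of joint nucleotide frequencies for a sequence pair and Π is the diagonal matrix of base frequencies. The matrix logarithm exists only when all eigenvalues of Π⁻¹F are positive. The function names `gtr_distance` and `jc_divergence` and the example matrices are hypothetical illustrations, not the authors' code or data.

```python
import numpy as np

def gtr_distance(F):
    """Return the GTR distance for a 4x4 joint-frequency matrix F,
    or None when the matrix logarithm is undefined (inapplicable case).
    Assumes every base has non-zero frequency."""
    F = np.asarray(F, dtype=float)
    F = (F + F.T) / 2.0                  # symmetrize, as time reversibility implies
    pi = F.sum(axis=1)                   # base frequencies (row sums)
    # Pi^-1 F is similar to the symmetric matrix S = Pi^-1/2 F Pi^-1/2,
    # so both share the same real eigenvalues.
    scale = 1.0 / np.sqrt(pi)
    S = F * np.outer(scale, scale)
    eigvals, eigvecs = np.linalg.eigh(S)
    if np.any(eigvals <= 0.0):
        return None                      # log undefined: distance cannot be computed
    log_S = (eigvecs * np.log(eigvals)) @ eigvecs.T
    # tr(Pi log(Pi^-1 F)) equals sum_i pi_i * (log S)_ii
    return -float(np.sum(pi * np.diag(log_S)))

def jc_divergence(t):
    """Exact joint-frequency matrix under Jukes-Cantor (a special case of GTR)
    for two sequences separated by t expected substitutions per site."""
    e = np.exp(-4.0 * t / 3.0)
    P = np.full((4, 4), 0.25 * (1.0 - e)) + np.eye(4) * e
    return P / 4.0                       # uniform base frequencies

# A well-behaved pair: the distance recovers the true divergence.
F_jc = jc_divergence(0.3)
d_jc = gtr_distance(F_jc)

# A hypothetical heavily diverged pair: mismatches dominate the diagonal,
# which produces negative eigenvalues.
F_saturated = np.full((4, 4), 0.8 / 12.0)
np.fill_diagonal(F_saturated, 0.05)
d_saturated = gtr_distance(F_saturated)

print(d_jc, d_saturated)
```

In this sketch the second matrix returns `None` because the logarithm does not exist. With observed rather than exact frequencies, sampling noise in short or highly diverged alignments could plausibly push an eigenvalue to zero or below, which would be consistent with the dependence on substitution rates and sequence length reported in the abstract.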

Topics: Original Research
Publisher: Libertas Academica
Provided by: PubMed Central
Suggested articles


    1. (2005). A benchmark of multiple sequence alignment programs upon structural RNAs.
    2. (1999). A comprehensive comparison of multiple sequence alignment programs.
    3. (2003). A hidden Markov model for progressive multiple alignment.
    4. (1974). A new look at the statistical model identification.
    5. (1984). A new method for calculating evolutionary substitution rates.
    6. (2006). A non-linear optimization procedure to estimate distances and instantaneous substitution rate matrices under the GTR model.
    7. (1996). Approximate methods for estimating the pattern of nucleotide substitution and the variation of substitution rates among sites.
    8. (1994). CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice.
    9. (1994). Detecting substitution-rate heterogeneity among regions of a nucleotide sequence.
    10. (1968). Evolutionary rate at the molecular level.
    11. (1997). General time-reversible distances with unequal rates across sites: mixing gamma and inverse Gaussian distributions with invariant sites.
    12. (2004). Inferring Phylogenies, Sinauer Associates, Inc.
    13. (1969). Mammalian Protein Metabolism.
    14. (2001). Maximum-likelihood phylogenetic analysis under a covarion-like model.
    15. (2003). MRBAYES3: Bayesian phylogenetic inference under mixed models.
    16. (1999). Nonlinear programming, 2nd edn, Athena Scientific.
    17. (2002). Numerical recipes in C++.
    18. (2003). Patterns of nucleotide substitution, insertion and deletion in the human genome inferred from pseudogenes.
    19. (2003). PAUP*. Phylogenetic Analysis Using Parsimony (*and Other Methods). Version 4. Sinauer Associates.
    20. (1994). Recovering evolutionary trees under a more realistic model of sequence evolution.
    21. (2001). Soap, cleaning multiple alignments from unstable blocks.
    22. (2003). The ant colony optimization metaheuristic: Algorithms, applications and advances.
    23. (1990). The general stochastic model of nucleotide substitution.
    24. (2000). The performance of several multiple-sequence alignment programs in relation to secondary-structure features for an rRNA sequence.
