213 research outputs found

    A generalized mechanistic codon model.

    Models of codon evolution have attracted particular interest because of their unique ability to detect selection forces and their high fit when applied to sequence evolution. We describe here a novel approach for modeling codon evolution, which is based on the Kronecker product of matrices. The 61 × 61 codon substitution rate matrix is created using the Kronecker product of three 4 × 4 nucleotide substitution matrices, the equilibrium frequency of codons, and the selection rate parameter. The entries of the nucleotide substitution matrices and the selection rate are considered as parameters of the model, which are optimized by maximum likelihood. Our fully mechanistic model allows the instantaneous substitution matrix between codons to be fully estimated with only 19 parameters instead of 3,721, by using the biological interdependence existing between positions within codons. We illustrate the properties of our model using computer simulations and assess its relevance by comparing the AICc measures of our model and other models of codon evolution on simulations and a large range of empirical data sets. We show that our model fits most biological data better than current codon models. Furthermore, the parameters in our model can be interpreted in a similar way to the exchangeability rates found in empirical codon models.
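
To make the matrix sizes in this abstract concrete, here is a minimal sketch in plain Python. It only shows how a Kronecker product of three 4 × 4 matrices yields a 64 × 64 matrix, and how dropping the three stop codons of the standard genetic code leaves the 61 × 61 sense-codon block (61 × 61 = 3,721 entries). It does not include the codon equilibrium frequencies or the selection rate parameter, so it is not the authors' full construction. The names `kron`, `codon_matrix`, and the `toy` matrix are hypothetical and chosen for illustration.

```python
from itertools import product

NUCS = "TCAG"                   # nucleotide order used to index rows and columns
STOPS = {"TAA", "TAG", "TGA"}   # stop codons of the standard genetic code


def kron(a, b):
    """Kronecker product of two square matrices given as nested lists."""
    n, m = len(a), len(b)
    size = n * m
    return [[a[i // m][j // m] * b[i % m][j % m] for j in range(size)]
            for i in range(size)]


def codon_matrix(m1, m2, m3):
    """Combine one 4 x 4 matrix per codon position and drop stop codons.

    Illustrative assumption only: no codon frequencies, no selection term.
    """
    full = kron(kron(m1, m2), m3)  # 64 x 64, codon index = 16*i1 + 4*i2 + i3
    codons = ["".join(c) for c in product(NUCS, repeat=3)]
    keep = [i for i, c in enumerate(codons) if c not in STOPS]
    sense = [codons[i] for i in keep]
    return sense, [[full[i][j] for j in keep] for i in keep]


# Toy per-position matrix (made-up values, not estimated from data).
toy = [[1.0 if i == j else 0.5 for j in range(4)] for i in range(4)]
sense, q = codon_matrix(toy, toy, toy)
print(len(sense), len(q) * len(q[0]))  # 61 3721
```

Each entry of the product is the product of three position-specific entries, which is why a small number of nucleotide-level parameters can determine every codon-level entry.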

    Accelerating Bayesian inference for evolutionary biology models.

    Bayesian inference is now widely used and relies largely on Markov chain Monte Carlo (MCMC) methods. Evolutionary biology has greatly benefited from the development of MCMC methods, but the design of more complex and realistic models and the ever-growing availability of novel data are pushing the limits of the current use of these methods. We present a parallel Metropolis-Hastings (M-H) framework built with a novel combination of enhancements aimed at parameter-rich and complex models. On a parameter-rich macroevolutionary model, we show increases in sampling speed of up to 35 times with 32 processors when compared to a sequential M-H process. More importantly, our framework achieves up to twentyfold faster convergence when estimating the posterior probability of phylogenetic trees using 32 processors, compared to MrBayes, the well-known software for Bayesian inference of phylogenetic trees. Availability: https://bitbucket.org/XavMeyer/hogan. Contact: [email protected]. Supplementary data are available at Bioinformatics online.
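
For readers unfamiliar with the sequential baseline mentioned in this abstract, the following is a minimal sketch of a textbook random-walk Metropolis-Hastings sampler in plain Python. It is not the parallel framework the abstract describes and includes none of its enhancements. The function name `metropolis_hastings`, its parameters, and the standard-normal target are assumptions made for illustration.

```python
import math
import random


def metropolis_hastings(log_target, x0, n_steps, step=1.0, seed=0):
    """Sequential random-walk M-H on a one-dimensional target.

    log_target: function returning the log density up to a constant.
    Returns the chain and the fraction of accepted proposals.
    """
    rng = random.Random(seed)
    x, lp = x0, log_target(x0)
    samples, accepted = [], 0
    for _ in range(n_steps):
        y = x + rng.gauss(0.0, step)       # symmetric Gaussian proposal
        lq = log_target(y)
        # Accept with probability min(1, target(y) / target(x)).
        if math.log(1.0 - rng.random()) < lq - lp:
            x, lp = y, lq
            accepted += 1
        samples.append(x)                  # a rejected move repeats the state
    return samples, accepted / n_steps


# Usage: sample a standard normal, whose log density is -x^2/2 up to a constant.
chain, rate = metropolis_hastings(lambda x: -0.5 * x * x, 0.0, 20000)
mean = sum(chain) / len(chain)
```

Every step depends on the previous state, which is why a plain chain like this is hard to parallelize and why dedicated frameworks are needed to use many processors.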

    No evidence for the radiation time lag model after whole genome duplications in Teleostei.

    The short- and long-term effects of polyploidization on the evolutionary fate of lineages are still unclear despite much interest. First recognized in land plants, polyploidization has since been shown to be widespread in eukaryotes, notably at the origin of vertebrates and teleost fishes. Many hypotheses have been proposed to link the species richness of lineages and whole genome duplications. For instance, the radiation time lag model suggests that paleopolyploidy would favour the appearance of new phenotypic traits, although the radiation of the lineage would not occur before a later dispersion event. Some results indicate that this model may be observed during land plant evolution. In this work, we test predictions of the radiation time lag model using both fossil data and molecular phylogenies in ancient and more recent teleost whole genome duplications. We fail to find any evidence of a delayed increase in species number after any of these events and conclude that paleopolyploidization still remains to be unambiguously linked to taxonomic diversity in teleosts.

    Bayesian estimation of multiple clade competition from fossil data

    Background: The diversification dynamics of clades are governed by speciation and extinction processes and are likely affected by multiple biotic, abiotic, and stochastic factors. Using quantitative methods to analyse fossil occurrence data, one may infer rates of speciation and extinction in a Bayesian framework. Moreover, Silvestro et al. (2015a) recently developed a Multiple Clade Diversity Dependence birth-death model (MCDD) to determine whether diversification dynamics can be explained by positive or negative effects of interactions within or between co-existing clades. However, the power and accuracy of this model and its general applicability have yet to be thoroughly investigated. Aims: Explore the properties of the existing MCDD implementation, which is based on Bayesian variable selection. Introduce an alternative parameterization based on the Horseshoe prior and show the properties of this approach for Bayesian shrinkage in complex models. Test the ability of the model to correctly identify within- and between-clade diversification interference under a suite of different diversification scenarios. Methods: Use simulations to assess and compare the power and accuracy of the two algorithms. Results: Diversity dependence within and between clades can be inferred with confidence in a wide range of scenarios using the MCDD model. The two implementations provide comparable results, but the new Horseshoe prior estimator appears to be more reliable, albeit slightly more conservative. The MCDD model is a powerful framework to analyse the putative effects of ecological interactions on macroevolutionary dynamics using fossil data and provides a sound statistical basis for future method developments.

    State aggregation for fast likelihood computations in molecular evolution.

    MOTIVATION: Codon models are widely used to identify the signature of selection at the molecular level and to test for changes in selective pressure during the evolution of genes encoding proteins. The large size of the state space of the Markov processes used to model codon evolution makes it difficult to use these models with large biological datasets. We propose here to use state aggregation to reduce the state space of codon models and, thus, improve the computational performance of likelihood estimation on these models. RESULTS: We show that this heuristic speeds up the computations of the M0 and branch-site models by up to 6.8 times. We also show through simulations that state aggregation does not introduce a detectable bias. We analysed a real dataset and show that aggregation provides predictions that are highly correlated with those of the full likelihood computations. Finally, state aggregation is a very general approach and can be applied to any continuous-time Markov process-based model with a large state space, such as amino acid and coevolution models. We therefore discuss different ways to apply state aggregation to Markov models used in phylogenetics. AVAILABILITY: The heuristic is implemented in the godon package (https://bitbucket.org/Davydov/godon) and in a version of FastCodeML (https://gitlab.isb-sib.ch/phylo/fastcodeml). CONTACT: [email protected] SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
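
As a rough illustration of what reducing the state space of a continuous-time Markov process can look like, the sketch below lumps states into groups and computes group-to-group rates weighted by stationary frequencies. This is one common way to aggregate a rate matrix and is an assumption made here. The abstract does not specify the scheme used in godon or FastCodeML, and the name `aggregate_rates` and the three-state example are hypothetical.

```python
def aggregate_rates(q, pi, groups):
    """Aggregate a rate matrix over a partition of its states.

    q:      rate matrix as nested lists (each row sums to zero)
    pi:     stationary frequencies of the original states
    groups: list of lists of state indices forming a partition
    The rate from group A to group B is the pi-weighted average, over
    states i in A, of the total rate from i into B.
    """
    k = len(groups)
    agg = [[0.0] * k for _ in range(k)]
    for a, ga in enumerate(groups):
        weight = sum(pi[i] for i in ga)
        for b, gb in enumerate(groups):
            flow = sum(pi[i] * q[i][j] for i in ga for j in gb)
            agg[a][b] = flow / weight
    return agg


# Toy example: three symmetric states, with the last two merged into one.
q = [[-2.0, 1.0, 1.0],
     [1.0, -2.0, 1.0],
     [1.0, 1.0, -2.0]]
pi = [1.0 / 3.0] * 3
small = aggregate_rates(q, pi, [[0], [1, 2]])
```

In this toy case the three-state matrix collapses to a two-state matrix whose rows still sum to zero, so it remains a valid rate matrix while being cheaper to exponentiate.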

    Why are some species older than others? A large-scale study of vertebrates.

    BACKGROUND: Strong variations are observed between and within taxonomic groups in the age of extant species, and these differences can clarify the factors that render species more vulnerable to extinction. Understanding the factors that influence the resilience of species is thus a key component of evolutionary biology, but it is also of prime importance in the context of climate change and for conservation in general. We explored the effect of extrinsic and intrinsic factors on the timing of the oldest diversification event in over 600 vertebrate species distributed worldwide. We used phylogenetic comparative methods to show that colour polymorphism, latitude and reproduction (the latter through its interaction with latitude) affected the timing of the oldest diversification event within a species. RESULTS: Species from higher latitudes tended to be younger, and colour-polymorphic species were older than monomorphic species. Mode of reproduction was also important, in that the age of oviparous species decreased with latitude, whereas no pattern was apparent for viviparous species. Organisms that have already persisted for a long time may be more likely to cope with future modifications of their environment. CONCLUSIONS: Species that are colour polymorphic, viviparous, and/or live at low latitudes have exhibited resilience to past environmental changes, and hence may be better able to deal with current climate change.

    Taxogenomics of the order Chlamydiales.

    Bacterial classification is a long-standing problem for taxonomists, and species definition itself is constantly debated among specialists. The classification of strict intracellular bacteria such as members of the order Chlamydiales mainly relies on DNA- or protein-based phylogenetic reconstructions because these organisms exhibit few phenotypic differences and are difficult to culture. The availability of full genome sequences allows the comparison of the performance of conserved protein sequences to reconstruct Chlamydiales phylogeny. This approach permits the identification of markers that maximize the phylogenetic signal and the robustness of the inferred tree. In this study, a set of 424 core proteins was identified and concatenated to reconstruct a reference species tree. Although individual protein trees present variable topologies, we detected only a few cases of incongruence with the reference species tree, which were due to horizontal gene transfers. Detailed analysis of the phylogenetic information of individual protein sequences (i) showed that phylogenies based on single randomly chosen core proteins are not reliable and (ii) led to the identification of twenty taxonomically highly reliable proteins, allowing the reconstruction of a robust tree close to the reference species tree. We recommend using these protein sequences to precisely classify newly discovered isolates at the family, genus and species levels.

    Molecular evolutionary rates are not correlated with temperature and latitude in Squamata: an exception to the metabolic theory of ecology?

    The metabolic theory of ecology stipulates that molecular evolutionary rates should correlate with temperature and latitude in ectothermic organisms. Previous studies have shown that most groups of vertebrates, such as amphibians, turtles and even endothermic mammals, have higher molecular evolutionary rates in regions where temperature is high. However, the association between molecular evolutionary rates and temperature or latitude has never been tested in Squamata. We used a large dataset including the spatial distributions and environmental variables for 1,651 species of Squamata and compared the contrast of the rates of molecular evolution with the contrast of temperature and latitude between sister species. Using major axis regressions and a new algorithm to choose independent sister species pairs, we found that temperature and absolute latitude were not associated with molecular evolutionary rates. This absence of association in such a diverse ectothermic group calls into question the mechanisms explaining the current pattern of species diversity in Squamata and challenges the presupposed universality of the metabolic theory of ecology.

    Linking micro and macroevolution in the presence of migration

    Understanding macroevolutionary patterns is central to evolutionary biology. This involves the process of divergence within a species, which starts at the microevolutionary level, for instance, when two subpopulations evolve towards different phenotypic optima. The speed at which these optima are reached is controlled by the degree of stabilising selection, which pushes the mean trait towards different optima in the different subpopulations, and by ongoing migration, which pulls the mean phenotype away from that optimum. Traditionally, macro phenotypic evolution is modelled by directional selection processes, but these models usually ignore the role of migration within species. Here, our goal is to reconcile the processes of micro and macroevolution by modelling migration as part of the speciation process. More precisely, we introduce an Ornstein-Uhlenbeck (OU) model where migration happens between two subpopulations within a branch of a phylogeny and this migration decreases over time, as happens during speciation. We then use this model to study the evolution of trait means along a phylogeny, as well as the way phenotypic disparity between species changes with successive epochs. We show that ignoring the effect of migration in sampled time-series data significantly biases the estimation of the selective forces acting upon it. We also show that migration decreases the expected phenotypic disparity between species and we analyse the effect of migration in the particular case of niche filling. We further introduce a method to jointly estimate selection and migration from time-series data. Our model extends traditional quantitative genetics results of selection and migration from a microevolutionary time frame to multiple speciation events at a macroevolutionary scale. Our results further support that not accounting for gene flow has important consequences for inferences at both the micro and macroevolutionary scale. (C) 2019 The Authors. Published by Elsevier Ltd.
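
The interplay of selection and migration described in this abstract can be illustrated with a deliberately simplified, deterministic sketch in plain Python. It assumes that each subpopulation mean is pulled towards its own optimum at rate `alpha` and towards the other subpopulation's mean at a migration rate that decays exponentially over time. These equations, the function name `mean_trajectories`, and all parameter names are assumptions made for illustration. The sketch omits the stochastic term of the OU process and is not the authors' model.

```python
import math


def mean_trajectories(theta1, theta2, alpha, m0, decay, t_end,
                      dt=0.001, z1=0.0, z2=0.0):
    """Euler integration of two coupled trait means.

    Assumed dynamics (illustrative only):
      dz1/dt = alpha * (theta1 - z1) + m(t) * (z2 - z1)
      dz2/dt = alpha * (theta2 - z2) + m(t) * (z1 - z2)
      m(t)   = m0 * exp(-decay * t)
    """
    steps = int(round(t_end / dt))
    for k in range(steps):
        m = m0 * math.exp(-decay * k * dt)   # migration fades during speciation
        dz1 = alpha * (theta1 - z1) + m * (z2 - z1)
        dz2 = alpha * (theta2 - z2) + m * (z1 - z2)
        z1, z2 = z1 + dt * dz1, z2 + dt * dz2
    return z1, z2


# Constant migration (decay = 0) versus no migration, same optima and selection.
with_migration = mean_trajectories(-1.0, 1.0, alpha=1.0, m0=0.5, decay=0.0, t_end=20.0)
no_migration = mean_trajectories(-1.0, 1.0, alpha=1.0, m0=0.0, decay=0.0, t_end=20.0)
```

Under these assumed dynamics with constant migration, the difference between the two means settles at alpha * (theta2 - theta1) / (alpha + 2 * m0), which is smaller than the distance between the optima and is consistent with the abstract's statement that migration reduces phenotypic disparity.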

    Transcriptomic resources for an endemic Neotropical plant lineage (Gesneriaceae).

    Despite the extensive phenotypic variation that characterizes the Gesneriaceae family, there is a lack of genomic resources to investigate the molecular basis of their diversity. We developed and compared the transcriptomes of two species of the Neotropical lineage of the Gesneriaceae. Illumina sequencing and de novo assembly of floral and leaf samples were used to generate multigene sequence data for Sinningia eumorpha and S. magnifica, two species endemic to the Brazilian Atlantic Forest. A total of 300 million reads were used to assemble the transcriptomes, with an average of 92,038 transcripts and 43,506 genes per species. The transcriptomes showed good quality metrics, with the presence of all eukaryotic core genes and an equal representation of clusters of orthologous groups (COG) classifications between species. The orthologous search produced 8,602 groups, with 15-20% of them annotated using BLAST tools. This study provides the first step toward a comprehensive multispecies transcriptome characterization of the Gesneriaceae family. These resources are the basis for comparative analyses in this species-rich Neotropical plant group; they will also allow the investigation of the evolutionary importance of multiple metabolic pathways and phenotypic diversity, as well as developmental programs in these nonmodel species.