64 research outputs found

    Using minimum bootstrap support for splits to construct confidence regions for trees

    Many of the estimated topologies in phylogenetic studies are presented with the bootstrap support for each of the splits in the topology indicated. If phylogenetic estimation is unbiased, high bootstrap support for a split suggests that there is a good deal of certainty that the split actually is present in the tree, and low bootstrap support suggests that one or more of the taxa on one side of the estimated split might in reality be located with taxa on the other side. In the latter case, the follow-up questions about how many and which of the taxa could reasonably be incorrectly placed, as well as where they might alternatively be placed, are not addressed through the presented bootstrap support. We present here an algorithm that finds the set of all trees with minimum bootstrap support for their splits greater than some given value. The output is a list of trees ranked according to the minimum bootstrap supports for splits in the trees. The number of such trees and their topologies provide useful supplementary information in bootstrap analyses about the reasons for low bootstrap support for splits. We also present ways of quantifying low bootstrap support by considering the set of all topologies with minimum bootstrap support greater than some quantity as providing a confidence region of topologies. Using a double bootstrap, we are able to choose a cutoff so that the set of topologies with minimum bootstrap support for a split greater than that cutoff gives an approximate 95% confidence region. As with bootstrap support, one advantage of the methods is that they are generally applicable to the wide variety of phylogenetic estimation methods.
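
The ranking idea in this abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: it assumes each tree is already encoded as a frozenset of split labels (the string labels such as "AB|CDE" are hypothetical), and it only ranks a supplied list of candidate trees rather than enumerating every tree that can be built from well-supported splits. The function names `split_support` and `rank_by_min_support` are invented for illustration.

```python
from collections import Counter


def split_support(bootstrap_trees):
    """Return the fraction of bootstrap trees that contain each split."""
    counts = Counter()
    for tree in bootstrap_trees:
        counts.update(set(tree))  # count each split once per tree
    n = len(bootstrap_trees)
    return {split: c / n for split, c in counts.items()}


def rank_by_min_support(candidate_trees, bootstrap_trees, cutoff):
    """Keep candidates whose weakest split exceeds `cutoff`, best first.

    Assumption: a tree is a frozenset of hashable split labels.
    """
    support = split_support(bootstrap_trees)
    ranked = []
    for tree in candidate_trees:
        # A split never seen in the bootstrap sample has support 0.
        weakest = min(support.get(split, 0.0) for split in tree)
        if weakest > cutoff:
            ranked.append((weakest, tree))
    ranked.sort(key=lambda pair: pair[0], reverse=True)
    return ranked


# Hypothetical five-taxon example: each tree has two nontrivial splits.
t1 = frozenset({"AB|CDE", "ABC|DE"})
t2 = frozenset({"AB|CDE", "ABD|CE"})
t3 = frozenset({"AC|BDE", "ABC|DE"})
bootstrap_sample = [t1] * 6 + [t2] * 3 + [t3] * 1

for weakest, tree in rank_by_min_support([t1, t2, t3], bootstrap_sample, 0.2):
    print(f"{weakest:.2f}", sorted(tree))
```

Choosing the cutoff so that the retained set forms an approximate 95% confidence region is the role of the double bootstrap described in the abstract and is not attempted here.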

    PROCOV: maximum likelihood estimation of protein phylogeny under covarion models and site-specific covarion pattern analysis

    Background: The covarion hypothesis of molecular evolution holds that selective pressures on a given amino acid or nucleotide site are dependent on the identity of other sites in the molecule that change throughout time, resulting in changes of evolutionary rates of sites along the branches of a phylogenetic tree. At the sequence level, covarion-like evolution at a site manifests as conservation of nucleotide or amino acid states among some homologs where the states are not conserved in other homologs (or groups of homologs). Covarion-like evolution has been shown to relate to changes in functions at sites in different clades, and, if ignored, can adversely affect the accuracy of phylogenetic inference. Results: PROCOV (protein covarion analysis) is a software tool that implements a number of previously proposed covarion models of protein evolution for phylogenetic inference in a maximum likelihood framework. Several algorithmic and implementation improvements in this tool over previous versions make computationally expensive tree searches with covarion models more efficient and analyses of large phylogenomic data sets tractable. PROCOV can be used to identify covarion sites by comparing the site likelihoods under the covarion process to the corresponding site likelihoods under a rates-across-sites (RAS) process. Those sites with the greatest log-likelihood difference between a 'covarion' and an RAS process were found to be of functional or structural significance in a dataset of bacterial and eukaryotic elongation factors. Conclusion: Covarion models implemented in PROCOV may be especially useful for phylogenetic estimation when ancient divergences between sequences have occurred and rates of evolution at sites are likely to have changed over the tree. It can also be used to study lineage-specific functional shifts in protein families that result in changes in the patterns of site variability among subtrees.
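
The site-level comparison described in this abstract amounts to ranking sites by the difference between two per-site log-likelihoods. The sketch below assumes those values have already been computed and are available as two equal-length lists; it does not reproduce PROCOV's interface or output format, and the function name `rank_covarion_sites` is hypothetical.

```python
def rank_covarion_sites(lnl_covarion, lnl_ras, top=None):
    """Rank alignment sites by (covarion lnL - RAS lnL), largest first.

    Assumption: both inputs hold one log-likelihood per site, in the
    same site order. Returns (site_index, difference) pairs.
    """
    if len(lnl_covarion) != len(lnl_ras):
        raise ValueError("both models must report one value per site")
    diffs = [
        (site, cov - ras)
        for site, (cov, ras) in enumerate(zip(lnl_covarion, lnl_ras))
    ]
    # Sites that the covarion model explains much better come first.
    diffs.sort(key=lambda pair: pair[1], reverse=True)
    return diffs if top is None else diffs[:top]


# Made-up values for three sites, purely for illustration.
covarion = [-3.0, -2.0, -5.0]
ras = [-3.5, -4.0, -5.0]
print(rank_covarion_sites(covarion, ras, top=2))
```

A large positive difference only flags a site as a candidate; the abstract's claim of functional or structural significance rests on the authors' analysis of the elongation factor dataset, not on the ranking alone.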

    Cellular costs underpin micronutrient limitation in phytoplankton

    Micronutrients control phytoplankton growth in the ocean, influencing carbon export and fisheries. It is currently unclear how micronutrient scarcity affects cellular processes and how interdependence across micronutrients arises. We show that proximate causes of micronutrient growth limitation and interdependence are governed by cumulative cellular costs of acquiring and using micronutrients. Using a mechanistic proteomic allocation model of a polar diatom focused on iron and manganese, we demonstrate how cellular processes fundamentally underpin micronutrient limitation, and how they interact and compensate for each other to shape cellular elemental stoichiometry and resource interdependence. We coupled our model with metaproteomic and environmental data, yielding an approach for estimating biogeochemical metrics, including taxon-specific growth rates. Our results show that cumulative cellular costs govern how environmental conditions modify phytoplankton growth.

    Posterior summarisation in Bayesian phylogenetics using Tracer 1.7

    Bayesian inference of phylogeny using Markov chain Monte Carlo (MCMC) plays a central role in understanding evolutionary history from molecular sequence data. Visualizing and analyzing the MCMC-generated samples from the posterior distribution is a key step in any non-trivial Bayesian inference. We present the software package Tracer (version 1.7) for visualizing and analyzing the MCMC trace files generated through Bayesian phylogenetic inference. Tracer provides kernel density estimation, multivariate visualization, demographic trajectory reconstruction, conditional posterior distribution summary, and more. Tracer is open-source and available at http://beast.community/tracer.

    Reproducing the manual annotation of multiple sequence alignments using a SVM classifier

    Motivation: Aligning protein sequences with the best possible accuracy requires sophisticated algorithms. Since the optimal alignment is not guaranteed to be the correct one, it is expected that even the best alignment will contain sites that do not respect the assumption of positional homology. Because formulating rules to identify these sites is difficult, it is common practice to manually remove them. Although considered necessary in some cases, manual editing is time consuming and not reproducible. We present here an automated editing method based on the classification of ‘valid’ and ‘invalid’ sites.

    Data from: Bayesian long branch attraction bias and corrections

    Previous work on the star-tree paradox has shown that Bayesian methods suffer from a long branch attraction bias. That work is extended to settings involving more taxa and partially resolved trees. The long branch attraction bias is confirmed to arise more broadly and an additional source of bias is found. A by-product of the analysis is methods that correct for biases toward particular topologies. The corrections can be easily calculated using existing Bayesian software. Posterior support for a set of two or more trees can thus be supplemented with corrected versions to cross-check or replace results. Simulations show the corrections to be highly effective.

    Supplementary Material for Bayesian Long-Branch Attraction Bias and Corrections
