284 research outputs found

    Novel non-parametric models to estimate evolutionary rates and divergence times from heterochronous sequence data

    Background: Early methods for estimating divergence times from gene sequence data relied on the assumption of a molecular clock. More sophisticated methods were created to model rate variation and used auto-correlation of rates, local clocks, or the so-called "uncorrelated relaxed clock", where substitution rates are assumed to be drawn from a parametric distribution. In the case of Bayesian inference methods, the impact of the prior on branching times is not clearly understood, and if the amount of data is limited the posterior could be strongly influenced by the prior. Results: We develop a maximum likelihood method, Physher, that uses local or discrete clocks to estimate evolutionary rates and divergence times from heterochronous sequence data. Using two empirical data sets, we show that our discrete clock estimates are similar to those obtained by other methods, and that Physher outperformed some methods in the estimation of the root age of an influenza virus data set. A simulation analysis suggests that Physher can outperform a Bayesian method when the real topology contains two long branches below the root node, even when evolution is strongly clock-like. Conclusions: These results suggest it is advisable to use a variety of methods to estimate evolutionary rates and divergence times from heterochronous sequence data. Physher and the associated data sets used here are available online at. © 2014 Fourment and Holmes; licensee BioMed Central Ltd

    Seqotron: A user-friendly sequence editor for Mac OS X

    © 2016 Fourment and Holmes. Background: Accurate multiple sequence alignment is central to bioinformatics and molecular evolutionary analyses. Although sophisticated sequence alignment programs are available, manual adjustments are often required to improve alignment quality. Unfortunately, few programs offer a simple and intuitive way to edit sequence alignments. Results: We present Seqotron, a sequence editor that reads and writes files in a wide variety of sequence formats. Sequences can be easily aligned and manually edited using the mouse and keyboard. The program also allows the user to estimate both phylogenetic trees and distance matrices. Conclusions: Seqotron will benefit researchers who need to manipulate and align complex sequence data. Seqotron is a Mac OS X compatible open source project and is available from GitHub https://github.com/4ment/seqotron/

    Local and relaxed clocks, the best of both worlds

    Time-resolved phylogenetic methods use information about the time of sample collection to estimate the rate of evolution. Originally, the models used to estimate evolutionary rates were quite simple, assuming that all lineages evolve at the same rate, an assumption commonly known as the molecular clock. Richer and more complex models have since been introduced to capture the phenomenon of substitution rate variation among lineages. Two well known model extensions are the local clock, wherein all lineages in a clade share a common substitution rate, and the uncorrelated relaxed clock, wherein the substitution rate on each lineage is independent from other lineages while being constrained to fit some parametric distribution. We introduce a further model extension, called the flexible local clock (FLC), which provides a flexible framework to combine relaxed clock models with local clock models. We evaluate the flexible local clock on simulated and real datasets and show that it provides substantially improved fit to an influenza dataset. An implementation of the model is available for download from https://www.github.com/4ment/flc
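The clock models named in this abstract differ only in how a substitution rate is assigned to each branch. The sketch below is not the FLC implementation from the repository above; it is a minimal illustration, assuming a hypothetical five-branch tree, made-up clade labels, and a lognormal distribution as the parametric distribution for the relaxed clock.

```python
import random

def strict_clock(branches, rate):
    """Molecular clock: every branch shares one substitution rate."""
    return {b: rate for b in branches}

def local_clock(branch_clade, clade_rates):
    """Local clock: all branches in the same clade share that clade's rate."""
    return {b: clade_rates[c] for b, c in branch_clade.items()}

def uncorrelated_relaxed_clock(branches, mu, sigma, rng):
    """Uncorrelated relaxed clock: each branch rate is an independent
    draw from a parametric distribution (lognormal assumed here)."""
    return {b: rng.lognormvariate(mu, sigma) for b in branches}

# Hypothetical tree: branch names and clade membership are illustrative only.
branch_clade = {"b1": "A", "b2": "A", "b3": "B", "b4": "B", "b5": "B"}
branches = list(branch_clade)

strict = strict_clock(branches, rate=1e-3)
local = local_clock(branch_clade, {"A": 1e-3, "B": 4e-3})
relaxed = uncorrelated_relaxed_clock(
    branches, mu=-6.9, sigma=0.5, rng=random.Random(1)
)
```

In this picture, a model that combines the two ideas would let rates vary from branch to branch within a clade while still letting clades differ from one another; how the FLC actually does this is described in the paper, not here.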

    The impact of migratory flyways on the spread of avian influenza virus in North America

    © 2017 The Author(s). Background: Wild birds are the major reservoir hosts for influenza A viruses (AIVs) and have been implicated in the emergence of pandemic events in livestock and human populations. Understanding how AIVs spread within and across continents is therefore critical to the development of successful strategies to manage and reduce the impact of influenza outbreaks. In North America many bird species undergo seasonal migratory movements along a North-South axis, thereby providing opportunities for viruses to spread over long distances. However, the role played by such avian flyways in shaping the genetic structure of AIV populations remains uncertain. Results: To assess the relative contribution of bird migration along flyways to the genetic structure of AIV we performed a large-scale phylogeographic study of viruses sampled in the USA and Canada, involving the analysis of 3805 to 4505 sequences from 36 to 38 geographic localities depending on the gene segment data set. To assist in this we developed a maximum likelihood-based genetic algorithm to explore a wide range of complex spatial models, depicting a more complete picture of the migration network than determined previously. Conclusions: Based on phylogenies estimated from nucleotide sequence data sets, our results show that AIV migration rates are significantly higher within than between flyways, indicating that the migratory patterns of birds play a key role in viral dispersal. These findings provide valuable insights into the evolution, maintenance and transmission of AIVs, in turn allowing the development of improved programs for surveillance and risk assessment

    'Tannat' (Vitis vinifera L.) as a model of responses to climate variability

    The influence of climate variability on the grapevine is widely studied because of its impact on final grape composition and quality. For the period 1994-2016, thermal and water regimes and their influence on grapevine yield, sanitary status and berry composition were analyzed for 'Tannat' grown in commercial vineyards in the south of Uruguay (Lat 34° 37' S; 56° 17' W). Principal component analysis (PCA) separated the years into three groups. Group 1: growing-season rainfall higher than average, limited sanitary status, acidity and yield higher than average, lower sugar content, late harvest. Group 2: thermal conditions greater and water component lower than average, better sanitary status, sugar content and acidity lower than average, early harvest. Group 3: thermal conditions lower than average, rainfall higher during the budbreak-fruitset period and lower than average in the month before harvest, berry size and sugar content greater than average. Correlations between climate, yield and berry quality variables were established, and the stages of greatest sensitivity to these climate elements were determined. In the years studied, climate variability within the region was high, and 'Tannat' was strongly influenced by it

    A Surrogate Function for One-Dimensional Phylogenetic Likelihoods

    © The Author 2017. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution. All rights reserved. Phylogenetics has seen a steady increase in data set size and substitution model complexity, which require increasing amounts of computational power to compute likelihoods. This motivates strategies to approximate the likelihood functions for branch length optimization and Bayesian sampling. In this article, we develop an approximation to the 1D likelihood function as parametrized by a single branch length. Our method uses a four-parameter surrogate function abstracted from the simplest phylogenetic likelihood function, the binary symmetric model. We show that it offers a surrogate that can be fit over a variety of branch lengths, that it is applicable to a wide variety of models and trees, and that it can be used effectively as a proposal mechanism for Bayesian sampling. The method is implemented as a stand-alone open-source C library for calling from phylogenetics algorithms; it has proven essential for good performance of our online phylogenetic algorithm sts

    The grapevine (Vitis vinifera L. cv. Tannat) as an indicator of climate change: the case of Uruguay

    It is well known that plants respond to the climatic conditions of the year. The grapevine is particularly sensitive to daytime and night-time temperatures as well as to the water regime, and this sensitivity shows in the plant's response: variation in the duration of phenological stages such as ripening, in grape composition, or in plant health. The aim of this study is to show climate variability through the evolution of bioclimatic indices adapted to the grapevine and, for the last fifteen years, to analyze the crop's response to climate, so that the grapevine can be considered a possible indicator of climate change and variability. To test this hypothesis, results are presented from a fifteen-year series of experimental plots of the Tannat variety in vineyards established in southern Uruguay, relating climate factors to the plant's response. Theme: Climate. Universidad Nacional de La Plat

    Trial by phylogenetics - Evaluating the Multi-Species Coalescent for phylogenetic inference on taxa with high levels of paralogy (Gonyaulacales, Dinophyceae)

    Publicly available next-generation sequencing datasets of non-model organisms, such as marine protists, offer opportunities to explore their evolutionary relationships. In this study we explored the effects that dataset and model selection have on the phylogenetic inference of the Gonyaulacales, single-celled marine algae of the phylum Dinoflagellata with genomes that show extensive paralogy. We developed a method for identifying and extracting single copy genes from RNA-seq libraries and compared phylogenies inferred from these single copy genes with those inferred from commonly used genetic markers and phylogenetic methods. Comparison of two datasets and three different phylogenetic models showed that exclusive use of ribosomal DNA sequences, maximum likelihood and gene concatenation gave very different results from those obtained with the multi-species coalescent. The multi-species coalescent has recently been recognized as being robust to the inclusion of paralogs, including hidden paralogs present in single copy gene sets (pseudoorthologs). Comparisons of model fit strongly favored the multi-species coalescent for these data over a concatenated alignment (single tree) model. Our findings suggest that the multi-species coalescent (inferred either via Maximum Likelihood or Bayesian Inference) should be considered for future phylogenetic studies of organisms where accurate selection of orthologs is difficult

    19 Dubious Ways to Compute the Marginal Likelihood of a Phylogenetic Tree Topology.

    The marginal likelihood of a model is a key quantity for assessing the evidence provided by the data in support of a model. The marginal likelihood is the normalizing constant for the posterior density, obtained by integrating the product of the likelihood and the prior with respect to model parameters. Thus, the computational burden of computing the marginal likelihood scales with the dimension of the parameter space. In phylogenetics, where we work with tree topologies that are high-dimensional models, standard approaches to computing marginal likelihoods are very slow. Here, we study methods to quickly compute the marginal likelihood of a single fixed tree topology. We benchmark the speed and accuracy of 19 different methods to compute the marginal likelihood of phylogenetic topologies on a suite of real data sets under the JC69 model. These methods include several new ones that we develop explicitly to solve this problem, as well as existing algorithms that we apply to phylogenetic models for the first time. Altogether, our results show that the accuracy of these methods varies widely, and that accuracy does not necessarily correlate with computational burden. Our newly developed methods are orders of magnitude faster than standard approaches, and in some cases, their accuracy rivals the best established estimators
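The verbal definition in this abstract can be written as a single integral. The symbols are assumptions chosen for illustration and are not taken from the paper: $D$ is the sequence data, $\tau$ the fixed tree topology, and $\theta$ the continuous model parameters (for example, branch lengths).

```latex
\[
  p(D \mid \tau) \;=\; \int p(D \mid \theta, \tau)\, p(\theta \mid \tau)\, \mathrm{d}\theta ,
  \qquad
  p(\theta \mid D, \tau) \;=\; \frac{p(D \mid \theta, \tau)\, p(\theta \mid \tau)}{p(D \mid \tau)} .
\]
```

The first expression is the marginal likelihood; the second shows its role as the normalizing constant of the posterior density.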