
    The impact of the rate prior on Bayesian estimation of divergence times with multiple Loci.

    Bayesian methods provide a powerful way to estimate species divergence times by combining information from molecular sequences with information from the fossil record. With the explosive increase of genomic data, divergence time estimation increasingly uses data of multiple loci (genes or site partitions). Widely used computer programs to estimate divergence times use independent and identically distributed (i.i.d.) priors on the substitution rates for different loci. The i.i.d. prior is problematic. As the number of loci (L) increases, the prior variance of the average rate across all loci goes to zero at the rate 1/L. As a consequence, the rate prior dominates posterior time estimates when many loci are analyzed, and if the rate prior is misspecified, the estimated divergence times will converge to wrong values with very narrow credibility intervals. Here we develop a new prior on the locus rates based on the Dirichlet distribution that corrects the problematic behavior of the i.i.d. prior. We use computer simulation and real data analysis to highlight the differences between the old and new priors. For a dataset of six primate species, we show that with the old i.i.d. prior, if the prior rate is too high (or too low), the estimated divergence times are too young (or too old), outside the bounds imposed by the fossil calibrations. In contrast, with the new Dirichlet prior, posterior time estimates are insensitive to the rate prior and are compatible with the fossil calibrations. We re-analyzed a phylogenomic data set of 36 mammal species and show that using many fossil calibrations can alleviate the adverse impact of a misspecified rate prior to some extent. We recommend the use of the new Dirichlet prior in Bayesian divergence time estimation. [Bayesian inference, divergence time, relaxed clock, rate prior, partition analysis.] This work was supported by Biotechnology and Biological Sciences Research Council (BBSRC), UK, grant BB/J009709/1. Z.Y. is a Royal Society Wolfson Merit award holder. T.Z. is supported by Natural Science Foundation of China (NSF) grants (31301093, 11301294 and 11201224).
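    The 1/L shrinkage described in the abstract above can be checked numerically. The sketch below is only an illustration, not the authors' implementation: it assumes each locus rate is an independent gamma draw, and the shape and scale values, like both function names, are hypothetical choices made for this example. It compares a Monte Carlo estimate of the variance of the mean rate with the analytic value Var(rate)/L.

```python
import random
import statistics


def analytic_variance_of_mean_rate(num_loci, shape=2.0, scale=0.5):
    """Exact prior variance of the mean of num_loci i.i.d. gamma rates.

    A single gamma(shape, scale) rate has variance shape * scale**2, and the
    mean of L independent copies has that variance divided by L.
    """
    return shape * scale ** 2 / num_loci


def prior_variance_of_mean_rate(num_loci, shape=2.0, scale=0.5,
                                draws=20000, seed=1):
    """Monte Carlo estimate of the same quantity (assumed gamma prior)."""
    rng = random.Random(seed)
    mean_rates = [
        sum(rng.gammavariate(shape, scale) for _ in range(num_loci)) / num_loci
        for _ in range(draws)
    ]
    return statistics.variance(mean_rates)


if __name__ == "__main__":
    # The prior on the average rate tightens as more loci are added.
    for num_loci in (1, 10, 100):
        simulated = prior_variance_of_mean_rate(num_loci, draws=5000)
        exact = analytic_variance_of_mean_rate(num_loci)
        print(num_loci, simulated, exact)
```

    Under these assumptions the prior on the average rate becomes ever more concentrated as loci are added, whether or not its mean is correct, which is the behaviour the abstract identifies as problematic.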

    Estimation of divergence times for major lineages of galliform birds: Evidence from complete mitochondrial genome sequences

    Efforts to determine an absolute timescale for avian evolutionary history have recently been challenged by relaxed molecular clock methods, which recognize that rates of molecular evolution can vary significantly among organisms. In this study, we used relaxed molecular clocks to date the divergence of major lineages of Galliformes based on complete mitochondrial genomes. A nucleotide dataset of 13 concatenated protein-coding genes from 22 species of Galliformes was used to investigate the evolutionary divergences within the group. Using Gallus bravardi, Schaubortyx and Gallinuloides fossils as calibration points, divergence time analyses were performed with four relaxed molecular clock methods as follows: (1) Bayesian method of Multidivtime; (2) Bayesian Markov chain Monte Carlo (MCMC) analysis of the Bayesian evolutionary analysis by sampling trees (BEAST); (3) local rate minimum deformation method (LRMD) of TREEFINDER; and (4) nonparametric rate smoothing (NPRS) of TREEFINDER. The various relaxed clock methods all indicated that (1) Megapodiidae originated in the Late Cretaceous; (2) Numididae, Phasianidae, Arborophilinae and Coturnicinae originated in the Eocene of the Palaeogene; (3) Pavoninae and Gallininae originated at the Eocene-Oligocene boundary; (4) Phasianinae and Meleagridinae originated in the Oligocene; (5) estimated divergence times among most genera of Phasianidae were much older than those of previous studies. Our results might provide a more likely time scale for the evolutionary history of the galliform birds.

    Rate variation and estimation of divergence times using strict and relaxed clocks

    Background: Understanding causes of biological diversity may be greatly enhanced by knowledge of divergence times. Strict and relaxed clock models are used in Bayesian estimation of divergence times. We examined whether: i) strict clock models are generally more appropriate in shallow phylogenies where rate variation is expected to be low, ii) the likelihood ratio test of the clock (LRT) reliably informs which model is appropriate for dating divergence times. Strict and relaxed models were used to analyse sequences simulated under different levels of rate variation. Published shallow phylogenies (Black bass, Primate-sucking lice, Podarcis lizards, Gallotiinae lizards, and Caprinae mammals) were also analysed to determine natural levels of rate variation relative to the performance of the different models. Results: Strict clock analyses performed well on data simulated under the independent rates model when the standard deviation of log rate on branches, σ, was low (≤0.1), but were inappropriate when σ>0.1 (95% of rates fall within 0.0082-0.0121 subs/site/Ma when σ = 0.1, for a mean rate of 0.01). The independent rates relaxed clock model performed well at all levels of rate variation, although posterior intervals on times were significantly wider than for the strict clock. The strict clock is therefore superior when rate variation is low. The performance of a correlated rates relaxed clock model was similar to the strict clock. Increased numbers of independent loci led to slightly narrower posteriors under the relaxed clock while older root ages provided proportionately narrower posteriors. The LRT had low power for σ = 0.01-0.1, but high power for σ = 0.5-2.0. Posterior means of σ² were useful for assessing rate variation in published datasets. Estimates of natural levels of rate variation ranged from 0.05-3.38 for different partitions. Differences in divergence times between relaxed and strict clock analyses were greater in two datasets with higher σ² for one or more partitions, supporting the simulation results. Conclusions: The strict clock can be superior for trees with shallow roots because of low levels of rate variation between branches. The LRT allows robust assessment of suitability of the clock model as does examination of posteriors on σ².
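    The interval quoted in the abstract above (95% of rates within 0.0082-0.0121 subs/site/Ma for σ = 0.1 and a mean rate of 0.01) can be reproduced with a short calculation. The sketch below assumes that branch rates follow a log-normal distribution parameterised by its mean, so that the log-scale location is log(mean) − σ²/2. That parameterisation is an assumption made here, chosen because it matches the quoted numbers, and the function name is hypothetical.

```python
import math

# Two-sided 95% quantile of the standard normal distribution.
Z_95 = 1.959963984540054


def lognormal_rate_interval(mean_rate, sigma, z=Z_95):
    """Central interval of a log-normal rate with the given mean.

    sigma is the standard deviation of the log rate. The log-scale location
    is shifted by -sigma**2 / 2 so that the distribution's mean is mean_rate
    (an assumed parameterisation).
    """
    mu = math.log(mean_rate) - sigma ** 2 / 2
    return math.exp(mu - z * sigma), math.exp(mu + z * sigma)


if __name__ == "__main__":
    lower, upper = lognormal_rate_interval(0.01, 0.1)
    print(round(lower, 4), round(upper, 4))  # prints 0.0082 0.0121
```

    Raising σ to 0.5 or higher in the same function shows how quickly the plausible range of branch rates widens, which is the regime where the abstract reports the strict clock to be inappropriate.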

    Phylogenetic Analyses: A Toolbox Expanding towards Bayesian Methods

    The reconstruction of phylogenies is becoming an increasingly simple activity. This is mainly due to two reasons: the democratization of computing power and the increased availability of sophisticated yet user-friendly software. This review describes some of the latest additions to the phylogenetic toolbox, along with some of their theoretical and practical limitations. It is shown that Bayesian methods are under heavy development, as they offer the possibility to solve a number of long-standing issues and to integrate several steps of the phylogenetic analyses into a single framework. Specific topics include not only phylogenetic reconstruction, but also the comparison of phylogenies, the detection of adaptive evolution, and the estimation of divergence times between species

    Impact of the Partitioning Scheme on Divergence Times Inferred from Mammalian Genomic Data Sets

    Data partitioning has long been regarded as an important parameter for phylogenetic inference. The division of heterogeneous multigene data sets into partitions with similar substitution patterns is known to increase the performance of probabilistic phylogenetic methods. However, the effect of the partitioning scheme on divergence time estimates has generally been ignored. To investigate the impact of data partitioning on the estimation of divergence times, we have constructed two genomic data sets. The first, comprising 15 nuclear genes totalling 50,928 bp, was selected from the OrthoMam database; the second was composed of complete mitochondrial genomes. We studied two partitioning schemes: concatenated supermatrices and partitioned gene analysis. We have also measured the impact of taxonomic sampling on the estimates. After drawing divergence time inferences using the uncorrelated relaxed clock in BEAST, we have compared the age estimates between the partitioning schemes. Our results show that, in general, both schemes resulted in similar chronological estimates; however, the concatenated data sets were more efficient than the partitioned ones in attaining suitable effective sample sizes.

    Evolutionary history of the Pectoral Sparrow Arremon taciturnus: evidence for diversification during the Late Pleistocene

    We focus on reconstructing a spatiotemporal scenario of diversification of a widespread South American species, the Pectoral Sparrow Arremon taciturnus (Aves: Passerellidae). This species is widely distributed in both the humid and the dry forests of South America and therefore provides an interesting model for understanding the connection between different biomes of South America. We examined nucleotide sequences of the mitochondrial genes Cytochrome b (cyt-b) and NADH subunit 2 (ND2) from 107 specimens, and one nuclear marker (intron 7 of the beta-fibrinogen gene) from a subset of samples collected across the distribution ranges of A. t. taciturnus and A. t. nigrirostris. Six major lineages were recovered in the phylogenies that displayed high levels of variance of allele frequencies and corresponded to distinct geographical locations. The estimation of divergence times provided evidence that diversification of the six lineages of the Pectoral Sparrow occurred throughout the Late Pleistocene across major cis-Andean biomes and Amazonian interfluves. Our dataset for A. taciturnus provides further evidence that rivers in Amazonia constitute barriers promoting allopatric speciation, with occasional sharing of alleles among lineages, particularly those with adjacent distributions. Peer reviewed.

    Historical biogeography of the squids from the family Loliginidae (Teuthoidea: Myopsida)

    Indexing: Scopus. According to the vicariant hypothesis proposed by Brakoniecki (1986), the closure of the Tethys Sea and the opening of the Atlantic Ocean played an important role in the history of squids of the family Loliginidae, which is reflected in their current neritic distribution. Our study evaluated this hypothesis and alternative ideas to understand the historical biogeography of loliginid squids. This work is based on a phylogenetic hypothesis reconstructed with mitochondrial and nuclear sequences that incorporates the estimation of divergence times and ancestral distribution. Our results suggest that the squids of the family Loliginidae originated in the Western Pacific during the Late Paleocene, about 59 My ago, and that their diversification involved at least 20 dispersal and 6 vicariance events. The first vicariance event fragmented the ancestral distribution, leaving the ancestor of Sepioteuthis in the south and that of the subfamily Loligininae in the north. Successive dispersal events, and some vicariance events (unrelated to the movement of tectonic plates and the opening of the Atlantic Ocean), shaped its distribution. Our inference suggests a different origin from the one proposed by Brakoniecki (the Tethys Sea), consistent with a center of origin that harbors the greatest diversity of the family, and a predominance of dispersal over vicariance in explaining the present distribution pattern. http://www.lajar.cl/pdf/imar/v45n1/Art%C3%ADculo_45_1_11.pd

    The effect of fossil sampling on the estimation of divergence times with the fossilised birth death process

    Timescales are of fundamental importance to evolutionary biology as they facilitate hypothesis tests of historical evolutionary processes. Through the incorporation of fossil occurrence data, the fossilised birth-death (FBD) process provides a framework for estimating divergence times using more palaeontological data than traditional node calibration approaches have allowed. The inclusion of more data can refine evolutionary timescale estimates, but for many taxonomic groups it is computationally infeasible to include all fossil occurrence data. Here, we utilise both empirical data and a simulation framework to identify approaches to subsampling fossil occurrence data that result in the most accurate estimates of divergence times. To achieve this we assess the performance of the FBD-Skyline model when implementing multiple approaches to incorporating subsampled fossil occurrences. Our results demonstrate that it is necessary to account for all available fossil occurrence data to achieve the most accurate estimates of clade age. We show that this can be achieved if an empirical Bayes approach to account for fossil sampling through time is applied to the FBD process. Random subsampling of occurrence data can lead to estimates of clade age that are incompatible with fossil evidence if no control over the affinities of fossil occurrences is enforced. Our results call into question the accuracy of previous divergence time studies incorporating the FBD process that have used only a subsample of all available fossil occurrence data.

    Supplementary Figure 1 (Positive_Control.pdf): Median age estimates and 95% HPDs obtained to demonstrate the behaviour of the FBD process without the subsampling of fossil occurrences. These results demonstrate the suitability of the simulation framework employed for subsequent analyses.

    Supplementary Figure 2 (Supp_Empirical.PDF): Median age estimates and 95% HPDs for Hymenoptera obtained using a range of approaches to constructing a subsample of fossil occurrences and constraining their placement. These approaches consist of: a uniform subsample of occurrences with and without topological constraints, a uniform subsample of occurrences supplemented with the oldest unequivocal members of clades which were then constrained to their respective crown groups, a subsample consisting of only the oldest unequivocal members of clades with and without topological constraints applied to constrain them to their respective crown groups.

    Supplementary Figure 3 (Drop_Results.ep): The accuracy and precision of estimated node ages obtained from subsamples of 100 replicate fossil occurrence datasets after the addition of the oldest occurrences for each clade to the subsample. These occurrences were then topologically constrained to lineages that descend from the node either 1, 2, or 4 nodes below the direct ancestor of the occurrence. Each point represents the median posterior age estimate of one clade, with grey bars representing the 95% HPD for that node age estimate. When occurrences are placed one node below their direct ancestor, an approach in which the rate of fossil sampling is estimated produces the greatest accuracy. When occurrences are placed with reduced accuracy, the accuracy of age estimates when sampling rate is an estimated parameter of the FBD process decreases. Conversely, fixing the rate of fossil sampling or placing an informed prior on this parameter improves the accuracy as fossils are placed with reduced accuracy. For all cases in which fossil occurrences are placed at a node that is lower than their true ancestral fossil, the 95% HPDs of age estimates extend to ages that violate the minimum age of the clade, as implied by the complete sample of fossil occurrences.

    Using the Fossil Record to Evaluate Timetree Timescales.

    The fossil and geologic records provide the primary data used to establish absolute timescales for timetrees. For the paleontological evaluation of proposed timetree timescales, and for node-based methods for constructing timetrees, the fossil record is used to bracket divergence times. Minimum brackets (minimum ages) can be established robustly using well-dated fossils that can be reliably assigned to lineages based on positive morphological evidence. Maximum brackets are much harder to establish, largely because it is difficult to establish definitive evidence that the absence of a taxon in the fossil record is real and not just due to the incompleteness of the fossil and rock records. Five primary methods have been developed to estimate maximum age brackets, each of which is discussed. The fact that the fossilization potential of a group typically decreases the closer one approaches its time of origin increases the challenge of estimating maximum age brackets. Additional complications arise: 1) because fossil data actually bracket the time of origin of the first relevant fossilizable morphology (apomorphy), not the divergence time itself; 2) due to the phylogenetic uncertainty in the placement of fossils; 3) because of idiosyncratic temporal and geographic gaps in the rock and fossil records; and 4) if the preservation potential of a group changed significantly during its history. In contrast, uncertainties in the absolute ages of fossils are typically relatively unimportant, even though the vast majority of fossils cannot be dated directly. These issues and relevant quantitative methods are reviewed, and their relative magnitudes assessed, which typically correlate with the age of the group, its geographic range, and species richness.