275 research outputs found

    When trees grow too long: Investigating the causes of highly inaccurate Bayesian branch-length estimates

    A surprising number of recent Bayesian phylogenetic analyses contain branch-length estimates that are several orders of magnitude longer than corresponding maximum-likelihood estimates. The levels of divergence implied by such branch lengths are unreasonable for studies using biological data and are known to be false for studies using simulated data. We conducted additional Bayesian analyses and studied approximate-posterior surfaces to investigate the causes underlying these large errors. We manipulated the starting parameter values of the Markov chain Monte Carlo (MCMC) analyses, the moves used by the MCMC analyses, and the prior-probability distribution on branch lengths. We demonstrate that inaccurate branch-length estimates result from either 1) poor mixing of MCMC chains or 2) posterior distributions with excessive weight at long tree lengths. Both effects are caused by a rapid increase in the volume of branch-length space as branches become longer. In the former case, both an MCMC move that scales all branch lengths in the tree simultaneously and the use of overdispersed starting branch lengths allow the chain to accurately sample the posterior distribution and should be used in Bayesian analyses of phylogeny. In the latter case, branch-length priors can have strong effects on resulting inferences and should be carefully chosen to reflect biological expectations. We provide a formula to calculate an exponential rate parameter for the branch-length prior that should eliminate inference of biased branch lengths in many cases. In any phylogenetic analysis, the biological plausibility of branch-length output must be carefully considered

    Comparing species tree estimation with large anchored phylogenomic and small Sanger-sequenced molecular datasets: an empirical study on Malagasy pseudoxyrhophiine snakes

    Background: Using molecular data generated by high-throughput next-generation sequencing (NGS) platforms to infer phylogeny is becoming common as costs go down and the ability to capture loci from across the genome goes up. While there is a general consensus that greater numbers of independent loci should result in more robust phylogenetic estimates, few studies have compared phylogenies resulting from smaller datasets for commonly used genetic markers with the large datasets captured using NGS. Here, we determine how a 5-locus Sanger dataset compares with a 377-locus anchored genomics dataset for understanding the evolutionary history of the pseudoxyrhophiine snake radiation centered in Madagascar. The Pseudoxyrhophiinae comprise ~86 % of Madagascar’s serpent diversity, yet they are poorly known with respect to ecology, behavior, and systematics. Using the 377-locus NGS dataset and the summary statistics species-tree methods STAR and MP-EST, we estimated a well-supported species tree that provides new insights concerning intergeneric relationships for the pseudoxyrhophiines. We also compared how these and other methods performed with respect to estimating tree topology using datasets with varying numbers of loci. Methods: Using Sanger sequencing and an anchored phylogenomics approach, we sequenced datasets comprised of 5 and 377 loci, respectively, for 23 pseudoxyrhophiine taxa. For each dataset, we estimated phylogenies using both gene-tree (concatenation) and species-tree (STAR, MP-EST) approaches. We determined the similarity of resulting tree topologies from the different datasets using Robinson-Foulds distances. In addition, we examined how subsets of these data performed compared to the complete Sanger and anchored datasets for phylogenetic accuracy using the same tree inference methodologies, as well as the program *BEAST to determine if a full coalescent model for species tree estimation could generate robust results with fewer loci compared to the summary statistics species tree approaches. We also examined the individual gene trees in comparison to the 377-locus species tree using the program MetaTree. Results: Using the full anchored dataset under a variety of methods gave us the same, well-supported phylogeny for pseudoxyrhophiines. The African pseudoxyrhophiine Duberria is the sister taxon to the Malagasy pseudoxyrhophiine genera, providing evidence for a monophyletic radiation in Madagascar. In addition, within Madagascar, the two major clades inferred correspond largely to the aglyphous and opisthoglyphous genera, suggesting that feeding specializations associated with tooth venom delivery may have played a major role in the early diversification of this radiation. The comparison of tree topologies from the concatenated and species-tree methods using different datasets indicated the 5-locus dataset cannot be used to infer a correct phylogeny for the pseudoxyrhophiines under any method tested here and that summary statistics methods require 50 or more loci to consistently recover the species tree inferred using the complete anchored dataset. However, as few as 15 loci may infer the correct topology when using the full coalescent species tree method *BEAST. MetaTree analyses of each gene tree from the Sanger and anchored datasets found that none of the individual gene trees matched the 377-locus species tree, and that no gene trees were identical with respect to topology. Conclusions: Our results suggest that ≥50 loci may be necessary to confidently infer phylogenies when using summary species-tree methods, but that the coalescent-based method *BEAST consistently recovers the same topology using only 15 loci. These results reinforce that datasets with small numbers of markers may result in misleading topologies, and further, that the method of inference used to generate a phylogeny also has a major influence on the number of loci necessary to infer robust species trees. Electronic supplementary material: The online version of this article (doi:10.1186/s12862-015-0503-1) contains supplementary material, which is available to authorized users

    Quantifying the spatiotemporal dynamics in a chorus frog (Pseudacris) hybrid zone over 30 years

    © 2016 The Authors. Ecology and Evolution published by John Wiley & Sons Ltd. Although theory suggests that hybrid zones can move or change structure over time, studies supported by direct empirical evidence for these changes are relatively limited. We present a spatiotemporal genetic study of a hybrid zone between Pseudacris nigrita and P. fouquettei across the Pearl River between Louisiana and Mississippi. This hybrid zone was initially characterized in 1980 as a narrow and steep “tension zone,” in which hybrid populations were inferior to parentals and were maintained through a balance between selection and dispersal. We reanalyzed historical tissue samples and compared them to samples of recently collected individuals using microsatellites. Clinal analyses indicate that the cline has not shifted in roughly 30 years but has widened significantly. Anthropogenic and natural changes may have affected selective pressure or dispersal, and our results suggest that the zone may no longer best be described as a tension zone. To the best of our knowledge, this study provides the first evidence of significant widening of a hybrid cline but stasis of its center. Continued empirical study of dynamic hybrid zones will provide insight into the forces shaping their structure and the evolutionary potential they possess for the elimination or generation of species

    Polyploidy breaks speciation barriers in Australian burrowing frogs Neobatrachus

    Polyploidy has played an important role in evolution across the tree of life but it is still unclear how polyploid lineages may persist after their initial formation. While both common and well-studied in plants, polyploidy is rare in animals and generally less understood. The Australian burrowing frog genus Neobatrachus is comprised of six diploid and three polyploid species and offers a powerful animal polyploid model system. We generated exome-capture sequence data from 87 individuals representing all nine species of Neobatrachus to investigate species-level relationships, the origin and inheritance mode of polyploid species, and the population genomic effects of polyploidy on genus-wide demography. We describe rapid speciation of diploid Neobatrachus species and show that the three independently originated polyploid species have tetrasomic or mixed inheritance. We document higher genetic diversity in tetraploids, resulting from widespread gene flow between the tetraploids, asymmetric inter-ploidy gene flow directed from sympatric diploids to tetraploids, and isolation of diploid species from each other. We also constructed models of ecologically suitable areas for each species to investigate the impact of climate on differing ploidy levels. These models suggest substantial change in suitable areas compared to past climate, which correspond to population genomic estimates of demographic histories. We propose that Neobatrachus diploids may be suffering the early genomic impacts of climate-induced habitat loss, while tetraploids appear to be avoiding this fate, possibly due to widespread gene flow. Finally, we demonstrate that Neobatrachus is an attractive model to study the effects of ploidy on the evolution of adaptation in animals

    A pilot study applying the plant Anchored Hybrid Enrichment method to New World sages (Salvia subgenus Calosphace; Lamiaceae)

    We conducted a pilot study using Anchored Hybrid Enrichment to resolve relationships among a mostly Neotropical sage lineage that may have undergone a recent evolutionary radiation. Conventional markers (ITS, trnL-trnF and trnH-psbA) have not been able to resolve the relationships among species or within portions of the backbone of the lineage. We sampled 12 representative species of subgenus Calosphace and included one species of Lepechinia, the closest relative of Salvia s.l., as outgroup. Hybrid enrichment and sequencing were successful, yielding 448 alignments of individual loci with an average length of 704 bp. The performance of the phylogenomic data in phylogenetic reconstruction was superior to that of conventional markers, increasing both support and resolution. Because the captured loci vary in the amount of net phylogenetic informativeness at different phylogenetic depths, these data are promising in phylogenetic reconstruction of this group and likely other lineages within Lamiales. However, special attention should be placed on the amount of phylogenetic noise that the data could potentially contain. A prior exploration step using phylogenetic informativeness profiles to detect loci with sites with disproportionately high substitution rates (showing "phantom" spikes) and, if required, the ensuing filtering of the problematic data is recommended. In our dataset, filtering resulted in increased support and resolution for the shallow nodes in maximum likelihood phylogenetic trees resulting from concatenated analyses of all the loci. Additionally, it is expected that an increase in sampling (loci and taxa) will aid in resolving weakly supported, short deep internal branches.
    Affiliation: Fragoso Martínez, Itzi (Universidad Nacional Autónoma de México; Mexico. Institute of Biology of UNAM)
    Affiliation: Salazar, Gerardo A. (Universidad Nacional Autónoma de México; Mexico)
    Affiliation: Martínez Gordillo, Martha (Universidad Nacional Autónoma de México; Mexico)
    Affiliation: Magallón, Susana (Universidad Nacional Autónoma de México; Mexico)
    Affiliation: Sánchez-Reyes, Luna (Universidad Nacional Autónoma de México; Mexico)
    Affiliation: Moriarty Lemmon, Emily (Florida State University; United States)
    Affiliation: Lemmon, Alan R. (Florida State University; United States)
    Affiliation: Sazatornil, Federico David (Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Córdoba. Instituto Multidisciplinario de Biología Vegetal. Universidad Nacional de Córdoba. Facultad de Ciencias Exactas Físicas y Naturales. Instituto Multidisciplinario de Biología Vegetal; Argentina)
    Affiliation: Granados Mendoza, Carolina (Instituto Potosino de Investigación Científica y Tecnológica; Mexico. Universidad Nacional Autónoma de México; Mexico)

    Are 100 enough? Inferring acanthomorph teleost phylogeny using Anchored Hybrid Enrichment

    BACKGROUND: The past decade has witnessed remarkable progress towards resolution of the Tree of Life. However, despite the increased use of genomic scale datasets, some phylogenetic relationships remain difficult to resolve. Here we employ anchored phylogenomics to capture 107 nuclear loci in 29 species of acanthomorph teleost fishes, with 25 of these species sampled from the recently delimited clade Ovalentaria. Previous studies employing multilocus nuclear exon datasets have not been able to resolve the nodes at the base of the Ovalentaria tree with confidence. Here we test whether a phylogenomic approach will provide better support for these nodes, and if not, why this may be. RESULTS: After using a novel method to account for paralogous loci, we estimated phylogenies with maximum likelihood and species tree methods using DNA sequence alignments of over 80,000 base pairs. Several key relationships within Ovalentaria are well resolved, including 1) the sister taxon relationship between Cichlidae and Pholidichthys, 2) a clade containing blennies, grammas, clingfishes, and jawfishes, and 3) monophyly of Atherinomorpha (topminnows, flyingfishes, and silversides). However, many nodes in the phylogeny associated with the early diversification of Ovalentaria are poorly resolved in several analyses. Through the use of rarefaction curves we show that limited phylogenetic resolution among the earliest nodes in the Ovalentaria phylogeny does not appear to be due to a deficiency of data, as average global node support ceases to increase when only 1/3rd of the sampled loci are used in analyses. Instead this lack of resolution may be driven by model misspecification as a Bayesian mixed model analysis of the amino acid dataset provided good support for parts of the base of the Ovalentaria tree. 
CONCLUSIONS: Although it does not appear that the limited phylogenetic resolution among the earliest nodes in the Ovalentaria phylogeny is due to a deficiency of data, it may be that both stochastic and systematic error resulting from model misspecification play a role in the poor resolution at the base of the Ovalentaria tree as a Bayesian approach was able to resolve some of the deeper nodes, where the other methods failed. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s12862-015-0415-0) contains supplementary material, which is available to authorized users

    Gill Structure Linked to Ecological and Species Diversification in a Clade of Caddisflies

    Streams represent a special case of directional environmental gradients where ecological opportunity for diversification may be associated with upstream and downstream dispersal into habitats that differ in selective pressures. Temperature, current velocity and variability, sediment erosion dynamics and oxygen saturation are key environmental parameters that change in predictable ways from springs to river mouth. Many aquatic insects occupy specific longitudinal regions along these gradients, indicating a high degree of adaptation to these specific environmental conditions. In caddisflies, the evolution of tracheal gills in larval and pupal stages may be a major driver in oxygen uptake efficiency and ecological diversification. Here we study the evolution of larval gill structure in the Rhyacophila vulgaris species group using phylogenomic methods. Based on anchored hybrid enrichment, we sequenced 97 kbp of data representing 159 independent nuclear protein coding gene regions to infer the phylogeny of the R. vulgaris species group, whose species exhibit both high diversity of gill types and varied longitudinal preferences. We find that the different gill types evolved independently as derived characters in the genus and that gill structure is linked to the longitudinal habitat preference, thereby serving as a possible ecological key innovation in the R. vulgaris group

    Anchored enrichment dataset for true flies (order Diptera) reveals insights into the phylogeny of flower flies (family Syrphidae)

    Background: Anchored hybrid enrichment is a form of next-generation sequencing that uses oligonucleotide probes to target conserved regions of the genome flanked by less conserved regions in order to acquire data useful for phylogenetic inference from a broad range of taxa. Once a probe kit is developed, anchored hybrid enrichment is superior to traditional PCR-based Sanger sequencing in terms of both the amount of genomic data that can be recovered and effective cost. Due to their incredibly diverse nature, importance as pollinators, and historical instability with regard to subfamilial and tribal classification, Syrphidae (flower flies or hoverflies) are an ideal candidate for anchored hybrid enrichment-based phylogenetics, especially since recent molecular phylogenies of the syrphids using only a few markers have resulted in highly unresolved topologies. Over 6200 syrphids are currently known and uncovering their phylogeny will help us to understand how these species have diversified, providing insight into an array of ecological processes, from the development of adult mimicry, the origin of adult migration, to pollination patterns and the evolution of larval resource utilization. Results: We present the first use of anchored hybrid enrichment in insect phylogenetics on a dataset containing 30 flower fly species from across all four subfamilies and 11 tribes out of 15. To produce a phylogenetic hypothesis, 559 loci were sampled to produce a final dataset containing 217,702 sites. We recovered a well resolved topology with bootstrap support values that were almost universally >95 %. The subfamily Eristalinae is recovered as paraphyletic, with the strongest support for this hypothesis to date. The ant predators in the Microdontinae are sister to all other syrphids. Syrphinae and Pipizinae are monophyletic and sister to each other. Larval predation on soft-bodied hemipterans evolved only once in this family. 
Conclusions: Anchored hybrid enrichment was successful in producing a robustly supported phylogenetic hypothesis for the syrphids. Subfamilial reconstruction is concordant with recent phylogenetic hypotheses, but with much higher support values. With the newly designed probe kit this analysis could be rapidly expanded with further sampling, opening the door to more comprehensive analyses targeting problem areas in syrphid phylogenetics and ecology. Peer reviewed

    Speciation in the mountains and dispersal by rivers: Molecular phylogeny of Eulamprus water skinks and the biogeography of Eastern Australia

    Aim: To develop a robust phylogeny for the iconic Australian water skinks (Eulamprus) and to explore the influence of landscape evolution of eastern Australia on phylogeographic patterns. Location: Eastern and south-eastern Australia. Methods: We used Sanger methods to sequence a mitochondrial DNA (mtDNA) locus for 386 individuals across the five Eulamprus species to elucidate phylogeographic structure. We also sequenced a second mtDNA locus and four nuclear DNA (nDNA) loci for a subset of individuals to help inform our sampling strategy for next-generation sequencing. Finally, we generated an anchored hybrid enrichment (AHE) approach to sequence 378 loci for 25 individuals representing the major lineages identified in our Sanger dataset. These data were used to resolve the phylogenetic relationships among the species using coalescent-based species tree inference in *BEAST and ASTRAL. Results: The relationships between Eulamprus species were resolved with a high level of confidence using our AHE dataset. In addition, our extensive mtDNA sampling revealed substantial phylogeographic structure in all species, with the exception of the geographically highly restricted E. leuraensis. Ratios of patristic distances (mtDNA/nDNA) indicate on average a 30-fold greater distance as estimated using the mtDNA locus ND4. Main conclusions: The major divergences between lineages strongly support previously identified biogeographic barriers in eastern Australia based on studies of other taxa. These breaks appear to correlate with regions where the Great Escarpment is absent or obscure, suggesting topographic lowlands and the accompanying dry woodlands are a major barrier to dispersal for water skinks. 
While some river corridors, such as the Hunter Valley, were likely historically dry enough to inhibit the movement of Eulamprus populations, our data indicate that others, such as the Murray and Darling Rivers, are able to facilitate extensive gene flow through the vast arid and semi-arid lowlands of New South Wales and South Australia. Comparing the patristic distances between the mitochondrial and AHE datasets highlights the continued value in analysing both types of data. Australian Research Council

    Off-target capture data, endosymbiont genes and morphology reveal a relict lineage that is sister to all other singing cicadas

    Phylogenetic asymmetry is common throughout the tree of life and results from contrasting patterns of speciation and extinction in the paired descendant lineages of ancestral nodes. On the depauperate side of a node, we find extant 'relict' taxa that sit atop long, unbranched lineages. Here, we show that a tiny, pale green, inconspicuous and poorly known cicada in the genus Derotettix, endemic to degraded salt-plain habitats in arid regions of central Argentina, is a relict lineage that is sister to all other modern cicadas. Nuclear and mitochondrial phylogenies of cicadas inferred from probe-based genomic hybrid capture data of both target and non-target loci and a morphological cladogram support this hypothesis. We strengthen this conclusion with genomic data from one of the cicada nutritional bacterial endosymbionts, Sulcia, an ancient and obligate endosymbiont of the larger plant-sucking bugs (Auchenorrhyncha) and an important source of maternally inherited phylogenetic data. We establish Derotettiginae subfam. nov. as a new, monogeneric, fifth cicada subfamily, and compile existing and new data on the distribution, ecology and diet of Derotettix. Our consideration of the palaeoenvironmental literature and host-plant phylogenetics allows us to predict what might have led to the relict status of Derotettix over 100 Myr of habitat change in South America.
    Affiliation: Simon, Chris (University of Connecticut; United States)
    Affiliation: Gordon, Eric R. L. (University of Connecticut; United States)
    Affiliation: Moulds, M.S. (Australian Museum Research Institute; Australia)
    Affiliation: Cole, Jeffrey A. (Pasadena City College; United States)
    Affiliation: Haji, Diler (University of Connecticut; United States)
    Affiliation: Lemmon, Alan R. (Florida State University; United States)
    Affiliation: Lemmon, Emily Moriarty (Florida State University; United States)
    Affiliation: Kortyna, Michelle (Florida State University; United States)
    Affiliation: Nazario, Katherine (University of Connecticut; United States)
    Affiliation: Wade, Elizabeth J. (Curry College. Department of Natural Sciences and Mathematics; United States. University of Connecticut; United States)
    Affiliation: Meister, Russell C. (University of Connecticut; United States)
    Affiliation: Goemans, Geert (University of Connecticut; United States)
    Affiliation: Chiswell, Stephen M. (National Institute of Water and Atmospheric Research; New Zealand)
    Affiliation: Pessacq, Pablo (Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Patagonia Norte. Centro de Investigación Esquel de Montaña y Estepa Patagónica. Universidad Nacional de la Patagonia "San Juan Bosco". Centro de Investigación Esquel de Montaña y Estepa Patagónica; Argentina)
    Affiliation: Veloso, Claudio (Universidad de Chile; Chile)
    Affiliation: McCutcheon, John P. (University of Montana; United States)
    Affiliation: Lukasik, Piotr (University of Montana; United States. Swedish Museum of Natural History. Department of Bioinformatics and Genetics; Sweden)