Systematic Error in Seed Plant Phylogenomics
Resolving the closest relatives of Gnetales has been an enigmatic problem in seed plant phylogeny. The problem is known to be difficult because of the extent of divergence between this diverse group of gymnosperms and their closest phylogenetic relatives. Here, we investigate the evolutionary properties of conifer chloroplast DNA sequences. To improve taxon sampling of Cupressophyta (non-Pinaceae conifers), we report sequences from three new chloroplast (cp) genomes of Southern Hemisphere conifers. We have applied a site pattern sorting criterion to study compositional heterogeneity, heterotachy, and the fit of conifer chloroplast genome sequences to a general time reversible + G substitution model. We show that non-time reversible properties of aligned sequence positions in the chloroplast genomes of Gnetales mislead phylogenetic reconstruction of these seed plants. When 2,250 of the most varied sites in our concatenated alignment are excluded, phylogenetic analyses favor a close evolutionary relationship between the Gnetales and Pinaceae (the Gnepine hypothesis). Our analytical protocol provides a useful approach for evaluating the robustness of phylogenomic inferences. Our findings highlight the importance of goodness of fit between substitution model and data for understanding seed plant phylogeny.
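The idea of excluding the most varied alignment sites can be sketched roughly as follows. The abstract does not spell out the site pattern sorting criterion, so this sketch substitutes per-column Shannon entropy as the ranking measure; that choice is an assumption for illustration, not the authors' method, and the function names `column_entropy` and `drop_most_variable_sites` are hypothetical.

```python
import math
from collections import Counter


def column_entropy(column):
    """Shannon entropy (bits) of the unambiguous bases in one alignment column."""
    bases = [c for c in column.upper() if c in "ACGT"]  # skip gaps and ambiguity codes
    if not bases:
        return 0.0
    total = len(bases)
    counts = Counter(bases)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def drop_most_variable_sites(alignment, n_drop):
    """Return a copy of the alignment without its n_drop highest-entropy columns.

    `alignment` maps taxon name -> aligned sequence (all the same length).
    Entropy is only a stand-in ranking criterion (an assumption of this sketch).
    """
    length = len(next(iter(alignment.values())))
    if any(len(seq) != length for seq in alignment.values()):
        raise ValueError("all sequences must have the same aligned length")
    columns = ["".join(seq[i] for seq in alignment.values()) for i in range(length)]
    # Rank column indices from most to least variable; ties keep alignment order.
    ranked = sorted(range(length), key=lambda i: -column_entropy(columns[i]))
    dropped = set(ranked[:n_drop])
    kept = [i for i in range(length) if i not in dropped]
    return {name: "".join(seq[i] for i in kept) for name, seq in alignment.items()}


toy = {"a": "ACGTA", "b": "ACGAA", "c": "ACCCA", "d": "ATTGA"}
reduced = drop_most_variable_sites(toy, 2)
```

Re-running the tree search on successively shortened alignments of this kind is one way to see whether a grouping depends on the most variable columns.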
Data from: Phylogenetic analysis of 47 chloroplast genomes clarifies the contribution of wild species to the domesticated apple maternal line
Both the origin of domesticated apple and the overall phylogeny of the genus Malus are still not completely resolved. To address this, we built an alignment 134,553 positions long that includes two previously published cpDNAs and 45 de novo sequenced, fully co-linear chloroplast genomes from cultivated apple varieties and wild apple species. The data produced are free from compositional heterogeneity and from substitutional saturation, which can adversely affect phylogeny reconstruction. Phylogenetic analyses based on this alignment recovered a branch with maximum bootstrap support subtending a large group of the cultivated apple varieties together with all analyzed European wild apple (Malus sylvestris) accessions. One apple cultivar was embedded in a monophylum comprising wild M. sieversii accessions and other Asian apple species. The data demonstrate that M. sylvestris has contributed its chloroplast genome to a substantial fraction of domesticated apple varieties, supporting the conclusion that different wild species contributed the organelle and nuclear genomes to domesticated apple.
Data from: The root of flowering plants and total evidence
Support for Amborella as the sole survivor of an evolutionary lineage that is sister to all other angiosperms comes from positions in DNA multiple-sequence alignments that have a poor fit to time-reversible substitution models. These sites exhibit significant levels of homoplasy, compositional heterogeneity, and strong heterotachy. We report phylogenetic analyses with observed, randomized, and simulated data which show there is little or no expectation that these sites provide useful information for understanding relationships among basal angiosperms. Their inclusion in phylogenetic analyses leads to a long-branch attraction artifact that favors Amborella as sister to other angiosperms in reconstructed phylogenies. Using parametric simulations, we show that sites in chloroplast sequences that exhibit less homoplasy between angiosperms and gymnosperms provide more reliable information for inferring basal angiosperm relationships. We confirm our earlier findings that the basal angiosperm Amborella is most closely related to aquatic herbs. Our current and previously reported (Goremykin et al. 2013) analyses highlight an essential aspect of the total evidence approach to phylogenetic inference. They suggest that data partitioning aimed at identifying components of the data that better fit evolutionary models is a more reliable approach to phylogeny reconstruction at deep taxonomic levels.
S1
Gapped alignment, 40553 positions in length, used to produce the OV alignment
Data from: The evolutionary root of flowering plants
Correct rooting of the angiosperm radiation is both challenging and necessary for understanding the origins and evolution of physiological and phenotypic traits in flowering plants. The problem is known to be difficult due to the large genetic distance separating flowering plants from other seed plants and the sparse taxon sampling among basal angiosperms. Here we provide further evidence for concern over substitution model misspecification in analyses of chloroplast DNA sequences. We show that support for Amborella as the sole representative of the most basal angiosperm lineage is founded on sequence site patterns poorly described by time reversible substitution models. Improving the fit between sequence data and substitution model identifies Trithuria, Nymphaeales and Amborella as surviving relatives of the most basal lineage of flowering plants. This finding indicates that aquatic and herbaceous species dominate the earliest extant lineage of flowering plants.
S6 - S6.zip
Zip archive containing scripts used in the paper
S2 - S2.txt
Alignment of the first and second codon positions, 25246 positions in length
S4 - S4.txt
Alignment 31674 positions long, maintaining the reading frame
S5 - S5.pdf
NNet split networks for the S1 and S4 alignments