108 research outputs found
Hyb-Seq for flowering plant systematics
High-throughput DNA sequencing (HTS) presents great opportunities for plant systematics, yet genomic complexity needs to be reduced for HTS to be effectively applied. We highlight Hyb-Seq as a promising approach, especially in light of the recent development of probes enriching 353 low-copy nuclear genes from any flowering plant taxon.
Paralogs and off-target sequences improve phylogenetic resolution in a densely-sampled study of the breadfruit genus (Artocarpus, Moraceae)
We present a 517-gene phylogenetic framework for the breadfruit genus Artocarpus (ca. 70 spp., Moraceae), making use of silica-dried leaves from recent fieldwork and herbarium specimens (some up to 106 years old) to achieve 96% taxon sampling. We explore issues relating to assembly, paralogous loci, partitions, and analysis method to reconstruct a phylogeny that is robust to variation in data and available tools. Although codon partitioning did not result in any substantial topological differences, the inclusion of flanking noncoding sequence in analyses significantly increased the resolution of gene trees. We also found that increasing the size of data sets increased convergence between analysis methods but did not reduce gene-tree conflict. We optimized the HybPiper targeted-enrichment sequence assembly pipeline for short sequences derived from degraded DNA extracted from museum specimens. Although the subgenera of Artocarpus were monophyletic, revision is required at finer scales, particularly with respect to widespread species. We expect our results to provide a basis for further studies in Artocarpus and provide guidelines for future analyses of data sets based on target enrichment data, particularly those using sequences from both fresh and museum material, counseling careful attention to the potential of off-target sequences to improve resolution. [Artocarpus; Moraceae; noncoding sequences; phylogenomics; target enrichment.]
Characterization of the basal angiosperm Aristolochia fimbriata: a potential experimental system for genetic studies
BACKGROUND: Previous studies in basal angiosperms have provided insight into the diversity within the angiosperm lineage and helped to polarize analyses of flowering plant evolution. However, there is still not an experimental system for genetic studies among basal angiosperms to facilitate comparative studies and functional investigation. It would be desirable to identify a basal angiosperm experimental system that possesses many of the features found in existing plant model systems (e.g., Arabidopsis and Oryza). RESULTS: We have considered all basal angiosperm families for general characteristics important for experimental systems, including availability to the scientific community, growth habit, and membership in a large basal angiosperm group that displays a wide spectrum of phenotypic diversity. Most basal angiosperms are woody or aquatic, thus are not well-suited for large scale cultivation, and were excluded. We further investigated members of Aristolochiaceae for ease of culture, life cycle, genome size, and chromosome number. We demonstrated self-compatibility for Aristolochia elegans and A. fimbriata, and transformation with a GFP reporter construct for Saruma henryi and A. fimbriata. Furthermore, A. fimbriata was easily cultivated with a life cycle of just three months, could be regenerated in a tissue culture system, and had one of the smallest genomes among basal angiosperms. An extensive multi-tissue EST dataset was produced for A. fimbriata that includes over 3.8 million 454 sequence reads. CONCLUSIONS: Aristolochia fimbriata has numerous features that facilitate genetic studies and is suggested as a potential model system for use with a wide variety of technologies. Emerging genetic and genomic tools for A. fimbriata and closely related species can aid the investigation of floral biology, developmental genetics, biochemical pathways important in plant-insect interactions as well as human health, and various other features present in early angiosperms
Phylogenomic analysis of transcriptome data elucidates co-occurrence of a paleopolyploid event and the origin of bimodal karyotypes in Agavoideae (Asparagaceae)
Premise of the study: The stability of the bimodal karyotype found in Agave and closely related species has long interested botanists. The origin of the bimodal karyotype has been attributed to allopolyploidy, but this hypothesis has not been tested. Next-generation transcriptome sequence data were used to test whether a paleopolyploid event occurred on the same branch of the Agavoideae phylogenetic tree as the origin of the Yucca-Agave bimodal karyotype. Methods: Illumina RNA-seq data were generated for phylogenetically strategic species in Agavoideae. Paleopolyploidy was inferred in analyses of frequency plots for synonymous substitutions per synonymous site (Ks) between Hosta, Agave, and Chlorophytum paralogous and orthologous gene pairs. Phylogenies of gene families including paralogous genes for these species and outgroup species were estimated to place inferred paleopolyploid events on a species tree. Key results: Ks frequency plots suggested paleopolyploid events in the history of the genera Agave, Hosta, and Chlorophytum. Phylogenetic analyses of gene families estimated from transcriptome data revealed two polyploid events: one predating the last common ancestor of Agave and Hosta and one within the lineage leading to Chlorophytum. Conclusions: We found that polyploidy and the origin of the Yucca-Agave bimodal karyotype co-occur on the same lineage, consistent with the hypothesis that the bimodal karyotype is a consequence of allopolyploidy. We discuss this and alternative mechanisms for the formation of the Yucca-Agave bimodal karyotype. More generally, we illustrate how the use of next-generation sequencing technology is a cost-efficient means for assessing genome evolution in nonmodel species.
A universal probe set for targeted sequencing of 353 nuclear genes from any flowering plant designed using k-medoids clustering
Sequencing of target-enriched libraries is an efficient and cost-effective method for obtaining DNA sequence data from hundreds of nuclear loci for phylogeny reconstruction. Much of the cost of developing targeted sequencing approaches is associated with the generation of preliminary data needed for the identification of orthologous loci for probe design. In plants, identifying orthologous loci has proven difficult due to a large number of whole-genome duplication events, especially in the angiosperms (flowering plants). We used multiple sequence alignments from over 600 angiosperms for 353 putatively single-copy protein-coding genes identified by the One Thousand Plant Transcriptomes Initiative to design a set of targeted sequencing probes for phylogenetic studies of any angiosperm group. To maximize the phylogenetic potential of the probes, while minimizing the cost of production, we introduce a k-medoids clustering approach to identify the minimum number of sequences necessary to represent each coding sequence in the final probe set. Using this method, 5–15 representative sequences were selected per orthologous locus, representing the sequence diversity of angiosperms more efficiently than if probes were designed using available sequenced genomes alone. To test our approximately 80,000 probes, we hybridized libraries from 42 species spanning all higher-order groups of angiosperms, with a focus on taxa not present in the sequence alignments used to design the probes. Out of a possible 353 coding sequences, we recovered an average of 283 per species and at least 100 in all species. Differences among taxa in sequence recovery could not be explained by relatedness to the representative taxa selected for probe design, suggesting that there is no phylogenetic bias in the probe set.
Our probe set, which targeted 260 kbp of coding sequence, achieved a median recovery of 137 kbp per taxon in coding regions, a maximum recovery of 250 kbp, and an additional median of 212 kbp per taxon in flanking non-coding regions across all species. These results suggest that the Angiosperms353 probe set described here is effective for any group of flowering plants and would be useful for phylogenetic studies from the species level to higher-order groups, including the entire angiosperm clade itself.
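The idea of choosing a few representative sequences per locus by k-medoids clustering can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the distance measure (proportion of mismatched sites between aligned sequences), the deterministic starting medoids, the simple alternating update, and the names `hamming`, `assign`, and `k_medoids` are all assumptions made for illustration.

```python
def hamming(a, b):
    """Proportion of differing sites between two aligned, equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def assign(dist, medoids):
    """Assign every sequence index to its nearest medoid (a medoid stays in its own cluster)."""
    clusters = {m: [] for m in medoids}
    for i in range(len(dist)):
        nearest = i if i in clusters else min(medoids, key=lambda m: dist[i][m])
        clusters[nearest].append(i)
    return clusters


def k_medoids(seqs, k, max_iter=100):
    """Alternating k-medoids: returns (sorted medoid indices, clusters keyed by medoid)."""
    n = len(seqs)
    dist = [[hamming(seqs[i], seqs[j]) for j in range(n)] for i in range(n)]
    medoids = list(range(k))  # assumption: start from the first k sequences
    for _ in range(max_iter):
        clusters = assign(dist, medoids)
        # New medoid of each cluster: the member with the smallest total
        # distance to all other members of that cluster.
        updated = sorted(
            min(members, key=lambda c: sum(dist[c][j] for j in members))
            for members in clusters.values()
        )
        if updated == medoids:
            break
        medoids = updated
    return medoids, assign(dist, medoids)


# Toy alignment (hypothetical data): two obvious groups of three sequences each.
toy = ["AAAAAAAA", "AAAAAAAT", "AAAAAATT", "GGGGGGGG", "GGGGGGGC", "GGGGGGCC"]
representatives, groups = k_medoids(toy, k=2)
```

Unlike k-means, each cluster centre here is an actual input sequence, which is what makes the medoids usable as probe templates. Choosing the smallest adequate `k` per locus would need an extra criterion (for example, a maximum allowed distance from any sequence to its medoid), which this sketch leaves out.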
Functional genomics of a generalist parasitic plant: Laser microdissection of host-parasite interface reveals host-specific patterns of parasite gene expression
Background: Orobanchaceae is the only plant family with members representing the full range of parasitic lifestyles plus a free-living lineage sister to all parasitic lineages, Lindenbergia. A generalist member of this family, and an important parasitic plant model, Triphysaria versicolor regularly feeds upon a wide range of host plants. Here, we compare de novo assembled transcriptomes generated from laser micro-dissected tissues at the host-parasite interface to uncover details of the largely uncharacterized interaction between parasitic plants and their hosts. Results: The interaction of Triphysaria with the distantly related hosts Zea mays and Medicago truncatula reveals dramatic host-specific gene expression patterns. Relative to above ground tissues, gene families are disproportionally represented at the interface including enrichment for transcription factors and genes of unknown function. Quantitative Real-Time PCR of a T. versicolor β-expansin shows strong differential (120x) upregulation in response to the monocot host Z. mays; a result that is concordant with our read count estimates. Pathogenesis-related proteins, other cell wall modifying enzymes, and orthologs of genes with unknown function (annotated as such in sequenced plant genomes) are among the parasite genes highly expressed by T. versicolor at the parasite-host interface. Conclusions: Laser capture microdissection makes it possible to sample the small region of cells at the epicenter of parasite-host interactions. The results of our analysis suggest that T. versicolor’s generalist strategy involves a reliance on overlapping but distinct gene sets, depending upon the host plant it is parasitizing. The massive upregulation of a T. versicolor β-expansin is suggestive of a mechanism for parasite success on grass hosts. In this preliminary study of the interface transcriptomes, we have shown that T. versicolor, and the Orobanchaceae in general, provide excellent opportunities for the characterization of plant genes with unknown functions.
Phylogenomics and the rise of the angiosperms
Angiosperms are the cornerstone of most terrestrial ecosystems and human livelihoods [1, 2]. A robust understanding of angiosperm evolution is required to explain their rise to ecological dominance. So far, the angiosperm tree of life has been determined primarily by means of analyses of the plastid genome [3, 4]. Many studies have drawn on this foundational work, such as classification and first insights into angiosperm diversification since their Mesozoic origins [5–7]. However, the limited and biased sampling of both taxa and genomes undermines confidence in the tree and its implications. Here, we build the tree of life for almost 8,000 (about 60%) angiosperm genera using a standardized set of 353 nuclear genes [8]. This 15-fold increase in genus-level sampling relative to comparable nuclear studies [9] provides a critical test of earlier results and brings notable change to key groups, especially in rosids, while substantiating many previously predicted relationships. Scaling this tree to time using 200 fossils, we discovered that early angiosperm evolution was characterized by high gene tree conflict and explosive diversification, giving rise to more than 80% of extant angiosperm orders. Steady diversification ensued through the remaining Mesozoic Era until rates resurged in the Cenozoic Era, concurrent with decreasing global temperatures and tightly linked with gene tree conflict. Taken together, our extensive sampling combined with advanced phylogenomic methods shows the deep history and full complexity in the evolution of a megadiverse clade.
A genome triplication associated with early diversification of the core eudicots
Background: Although it is agreed that a major polyploidy event, gamma, occurred within the eudicots, the phylogenetic placement of the event remains unclear. Results: To determine when this polyploidization occurred relative to speciation events in angiosperm history, we employed a phylogenomic approach to investigate the timing of gene set duplications located on syntenic gamma blocks. We populated 769 putative gene families with large sets of homologs obtained from public transcriptomes of basal angiosperms, magnoliids, asterids, and more than 91.8 gigabases of new next-generation transcriptome sequences of non-grass monocots and basal eudicots. The overwhelming majority (95%) of well-resolved gamma duplications was placed before the separation of rosids and asterids and after the split of monocots and eudicots, providing strong evidence that the gamma polyploidy event occurred early in eudicot evolution. Further, the majority of gene duplications was placed after the divergence of the Ranunculales and core eudicots, indicating that the gamma appears to be restricted to core eudicots. Molecular dating estimates indicate that the duplication events were intensely concentrated around 117 million years ago. Conclusions: The rapid radiation of core eudicot lineages that gave rise to nearly 75% of angiosperm species appears to have occurred coincidentally or shortly following the gamma triplication event. Reconciliation of gene trees with a species phylogeny can elucidate the timing of major events in genome evolution, even when genome sequences are only available for a subset of species represented in the gene trees. Comprehensive transcriptome datasets are valuable complements to genome sequences for high-resolution phylogenomic analysis
Parasitic Plants Striga and Phelipanche Dependent upon Exogenous Strigolactones for Germination Have Retained Genes for Strigolactone Biosynthesis
Strigolactones are plant hormones with multiple functions, including regulating various aspects of plant architecture such as shoot branching, facilitating the colonization of plant roots by arbuscular mycorrhizal fungi, and acting as seed germination stimulants for certain parasitic plants of the family Orobanchaceae. The obligate parasitic species Phelipanche aegyptiaca and Striga hermonthica require strigolactones for germination, while the facultative parasite Triphysaria versicolor does not. It has been hypothesized that P. aegyptiaca and S. hermonthica would have undergone evolutionary loss of strigolactone biosynthesis as a part of their mechanism to enable specific detection of exogenous strigolactones. We analyzed the transcriptomes of P. aegyptiaca, S. hermonthica and T. versicolor and identified genes known to act in strigolactone synthesis (D27, CCD7, CCD8, and MAX1), perception (MAX2 and D14) and transport (PDR12). These genes were then analyzed to assess likelihood of function. Transcripts of all strigolactone-related genes were found in P. aegyptiaca and S. hermonthica, and evidence points to their encoding functional proteins. Gene open reading frames were consistent with homologs from Arabidopsis and other strigolactone-producing plants, and all genes were expressed in parasite tissues. In general, the genes related to strigolactone synthesis and perception appeared to be evolving under codon-based selective constraints in strigolactone-dependent species. Bioassays of S. hermonthica root extracts indicated the presence of strigolactone class stimulants on germination of P. aegyptiaca seeds. Taken together, these results indicate that Phelipanche aegyptiaca and S. hermonthica have retained functional genes involved in strigolactone biosynthesis, suggesting that the parasites use both endogenous and exogenous strigolactones and have mechanisms to differentiate the two. (M. Das et al.)
Data access for the 1,000 Plants (1KP) project
© 2014 Matasci et al.; licensee BioMed Central Ltd. The 1,000 plants (1KP) project is an international multi-disciplinary consortium that has generated transcriptome data from over 1,000 plant species, with exemplars for all of the major lineages across the Viridiplantae (green plants) clade. Here, we describe how to access the data used in a phylogenomics analysis of the first 85 species, and how to visualize our gene and species trees. Users can develop computational pipelines to analyse these data, in conjunction with data of their own that they can upload. Computationally estimated protein-protein interactions and biochemical pathways can be visualized at another site. Finally, we comment on our future plans and how they fit within this scalable system for the dissemination, visualization, and analysis of large multi-species data sets
- …