76 research outputs found

    Methods for Obtaining and Analyzing Whole Chloroplast Genome Sequences

    During the past decade there has been a rapid increase in our understanding of plastid genome organization and evolution due to the availability of many new completely sequenced genomes. Currently there are 43 complete genomes published and ongoing projects are likely to increase this sampling to nearly 200 genomes during the next five years. Several groups of researchers including ours have been developing new techniques for gathering and analyzing entire plastid genome sequences and details of these developments are summarized in this chapter. The most important recent developments that enhance our ability to generate whole chloroplast genome sequences involve the generation of pure fractions of chloroplast genomes by whole genome amplification using rolling circle amplification, cloning genomes into fosmid or BAC vectors, and the development of an organellar annotation program (DOGMA). In addition to providing details of these methods, we provide an overview of methods for analyzing complete plastid genome sequences for repeats and gene content, as well as approaches for using gene order and sequence data for phylogeny reconstruction. This explosive increase in the number of sequenced plastid genomes and improved computational tools will provide many insights into the evolution of these genomes and much new data for assessing relationships at deep nodes in plants and other photosynthetic organisms.

    A universal probe set for targeted sequencing of 353 nuclear genes from any flowering plant designed using k-medoids clustering

    Sequencing of target-enriched libraries is an efficient and cost-effective method for obtaining DNA sequence data from hundreds of nuclear loci for phylogeny reconstruction. Much of the cost of developing targeted sequencing approaches is associated with the generation of preliminary data needed for the identification of orthologous loci for probe design. In plants, identifying orthologous loci has proven difficult due to a large number of whole-genome duplication events, especially in the angiosperms (flowering plants). We used multiple sequence alignments from over 600 angiosperms for 353 putatively single-copy protein-coding genes identified by the One Thousand Plant Transcriptomes Initiative to design a set of targeted sequencing probes for phylogenetic studies of any angiosperm group. To maximize the phylogenetic potential of the probes, while minimizing the cost of production, we introduce a k-medoids clustering approach to identify the minimum number of sequences necessary to represent each coding sequence in the final probe set. Using this method, 5–15 representative sequences were selected per orthologous locus, representing the sequence diversity of angiosperms more efficiently than if probes were designed using available sequenced genomes alone. To test our approximately 80,000 probes, we hybridized libraries from 42 species spanning all higher-order groups of angiosperms, with a focus on taxa not present in the sequence alignments used to design the probes. Out of a possible 353 coding sequences, we recovered an average of 283 per species and at least 100 in all species. Differences among taxa in sequence recovery could not be explained by relatedness to the representative taxa selected for probe design, suggesting that there is no phylogenetic bias in the probe set.
Our probe set, which targeted 260 kbp of coding sequence, achieved a median recovery of 137 kbp per taxon in coding regions, a maximum recovery of 250 kbp, and an additional median of 212 kbp per taxon in flanking non-coding regions across all species. These results suggest that the Angiosperms353 probe set described here is effective for any group of flowering plants and would be useful for phylogenetic studies from the species level to higher-order groups, including the entire angiosperm clade itself.
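
The k-medoids idea in the abstract above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say which distance measure, initialization, or software was used, so the p-distance (proportion of differing aligned sites), the farthest-first seeding, and the helper names `p_distance`, `assign`, and `k_medoids` are all assumptions made for illustration. The point it shows is that each cluster is represented by an actual input sequence (a medoid), which is what makes the result usable as a probe template.

```python
def p_distance(a, b):
    """Proportion of differing sites between two aligned sequences.

    Assumption: columns where either sequence has a gap ('-') are skipped.
    """
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        return 1.0
    return sum(x != y for x, y in pairs) / len(pairs)


def assign(dist, medoids):
    """Map each medoid index to the list of sequence indices nearest to it."""
    clusters = {m: [] for m in medoids}
    for i in range(len(dist)):
        # A medoid always stays in its own cluster, so no cluster is empty.
        nearest = i if i in clusters else min(medoids, key=lambda m: dist[i][m])
        clusters[nearest].append(i)
    return clusters


def k_medoids(seqs, k, max_iter=100):
    """Alternating k-medoids over aligned sequences (illustrative only).

    Returns (sorted medoid indices, {medoid index: member indices}).
    """
    n = len(seqs)
    if not 1 <= k <= n:
        raise ValueError("k must be between 1 and the number of sequences")
    dist = [[p_distance(seqs[i], seqs[j]) for j in range(n)] for i in range(n)]

    # Deterministic farthest-first seeding (an assumption, not from the source).
    medoids = [0]
    while len(medoids) < k:
        rest = [i for i in range(n) if i not in medoids]
        medoids.append(max(rest, key=lambda i: min(dist[i][m] for m in medoids)))
    medoids.sort()

    for _ in range(max_iter):
        clusters = assign(dist, medoids)
        # New medoid of a cluster: the member with the smallest summed
        # distance to all other members of that cluster.
        updated = sorted(
            min(members, key=lambda c: sum(dist[c][j] for j in members))
            for members in clusters.values()
        )
        if updated == medoids:
            break
        medoids = updated
    return medoids, assign(dist, medoids)


if __name__ == "__main__":
    # Toy alignment with two obvious groups (made-up data).
    toy = ["AAAAAAAA", "AAAAAAAT", "AAAAAATT",
           "GGGGGGGG", "GGGGGGGC", "GGGGGGCC"]
    medoids, clusters = k_medoids(toy, k=2)
    print("representative sequences:", [toy[m] for m in medoids])
    print("clusters:", clusters)
```

In a probe-design setting one would presumably run something like this per locus and increase `k` until every sequence lies within a chosen distance of its medoid; that stopping rule is likewise an assumption here, not something stated in the abstract.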

    Floral gene resources from basal angiosperms for comparative genomics research

    BACKGROUND: The Floral Genome Project was initiated to bridge the genomic gap between the most broadly studied plant model systems. Arabidopsis and rice, although now completely sequenced and under intensive comparative genomic investigation, are separated by at least 125 million years of evolutionary time, and cannot in isolation provide a comprehensive perspective on structural and functional aspects of flowering plant genome dynamics. Here we discuss new genomic resources available to the scientific community, comprising cDNA libraries and Expressed Sequence Tag (EST) sequences for a suite of phylogenetically basal angiosperms specifically selected to bridge the evolutionary gaps between model plants and provide insights into gene content and genome structure in the earliest flowering plants. RESULTS: Random sequencing of cDNAs from representatives of phylogenetically important eudicot, non-grass monocot, and gymnosperm lineages has so far (as of 12/1/04) generated 70,514 ESTs and 48,170 assembled unigenes. Efficient sorting of EST sequences into putative gene families based on whole Arabidopsis/rice proteome comparison has permitted ready identification of cDNA clones for finished sequencing. Preliminarily, (i) proportions of functional categories among sequenced floral genes seem representative of the entire Arabidopsis transcriptome, (ii) many known floral gene homologues have been captured, and (iii) phylogenetic analyses of ESTs are providing new insights into the process of gene family evolution in relation to the origin and diversification of the angiosperms. CONCLUSION: Initial comparisons illustrate the utility of the EST data sets toward discovery of the basic floral transcriptome. 
These first findings also afford the opportunity to address a number of conspicuous evolutionary genomic questions, including reproductive organ transcriptome overlap between angiosperms and gymnosperms, genome-wide duplication history, lineage-specific gene duplication and functional divergence, and analyses of adaptive molecular evolution. Since not all genes in the floral transcriptome will be associated with flowering, these EST resources will also be of interest to plant scientists working on other functions, such as photosynthesis, signal transduction, and metabolic pathways.

    Evolution of chloroplast retrograde signaling facilitates green plant adaptation to land

    Chloroplast retrograde signaling networks are vital for chloroplast biogenesis, operation, and signaling, including excess light and drought stress signaling. To date, retrograde signaling has been considered in the context of land plant adaptation, but not regarding the origin and evolution of signaling cascades linking chloroplast function to stomatal regulation. We show that key elements of the chloroplast retrograde signaling process, the nucleotide phosphatase (SAL1) and 3'-phosphoadenosine-5'-phosphate (PAP) metabolism, evolved in streptophyte algae, the algal ancestors of land plants. We discover an early evolution of SAL1-PAP chloroplast retrograde signaling in stomatal regulation based on conserved gene and protein structure, function, and enzyme activity and transit peptides of SAL1s in species including flowering plants, the fern Ceratopteris richardii, and the moss Physcomitrella patens. Moreover, we demonstrate that PAP regulates stomatal closure via secondary messengers and ion transport in guard cells of these diverse lineages. The origin of stomata facilitated gas exchange in the earliest land plants. Our findings suggest that the conquest of land by plants was enabled by rapid response to drought stress through the deployment of an ancestral SAL1-PAP signaling pathway, intersecting with the core abscisic acid signaling in stomatal guard cells.

    A physical map for the Amborella trichopoda genome sheds light on the evolution of angiosperm genome structure

    Background: Recent phylogenetic analyses have identified Amborella trichopoda, an understory tree species endemic to the forests of New Caledonia, as sister to a clade including all other known flowering plant species. The Amborella genome is a unique reference for understanding the evolution of angiosperm genomes because it can serve as an outgroup to root comparative analyses. A physical map, BAC end sequences and sample shotgun sequences provide a first view of the 870 Mbp Amborella genome. Results: Analysis of Amborella BAC ends sequenced from each contig suggests that the density of long terminal repeat retrotransposons is negatively correlated with that of protein coding genes. Syntenic, presumably ancestral, gene blocks were identified in comparisons of the Amborella BAC contigs and the sequenced Arabidopsis thaliana, Populus trichocarpa, Vitis vinifera and Oryza sativa genomes. Parsimony mapping of the loss of synteny corroborates previous analyses suggesting that the rate of structural change has been more rapid on lineages leading to Arabidopsis and Oryza compared with lineages leading to Populus and Vitis. The gamma paleohexaploidy event identified in the Arabidopsis, Populus and Vitis genomes is shown to have occurred after the divergence of all other known angiosperms from the lineage leading to Amborella. Conclusions: When placed in the context of a physical map, BAC end sequences representing just 5.4% of the Amborella genome have facilitated reconstruction of gene blocks that existed in the last common ancestor of all flowering plants. The Amborella genome is an invaluable reference for inferences concerning the ancestral angiosperm and subsequent genome evolution.

    The Kalanchoe genome provides insights into convergent evolution and building blocks of crassulacean acid metabolism

    Crassulacean acid metabolism (CAM) is a water-use efficient adaptation of photosynthesis that has evolved independently many times in diverse lineages of flowering plants. We hypothesize that convergent evolution of protein sequence and temporal gene expression underpins the independent emergences of CAM from C3 photosynthesis. To test this hypothesis, we generate a de novo genome assembly and genome-wide transcript expression data for Kalanchoë fedtschenkoi, an obligate CAM species within the core eudicots with a relatively small genome (~260 Mb). Our comparative analyses identify signatures of convergence in protein sequence and re-scheduling of diel transcript expression of genes involved in nocturnal CO2 fixation, stomatal movement, heat tolerance, circadian clock, and carbohydrate metabolism in K. fedtschenkoi and other CAM species in comparison with non-CAM species. These findings provide new insights into molecular convergence and building blocks of CAM and will facilitate CAM-into-C3 photosynthesis engineering to enhance water-use efficiency in crops.

    Dynamic genome evolution in a model fern

    The large size and complexity of most fern genomes have hampered efforts to elucidate fundamental aspects of fern biology and land plant evolution through genome-enabled research. Here we present a chromosomal genome assembly and associated methylome, transcriptome and metabolome analyses for the model fern species Ceratopteris richardii. The assembly reveals a history of remarkably dynamic genome evolution including rapid changes in genome content and structure following the most recent whole-genome duplication approximately 60 million years ago. These changes include massive gene loss, rampant tandem duplications and multiple horizontal gene transfers from bacteria, contributing to the diversification of defence-related gene families. The insertion of transposable elements into introns has led to the large size of the Ceratopteris genome and to exceptionally long genes relative to other plants. Gene family analyses indicate that genes directing seed development were co-opted from those controlling the development of fern sporangia, providing insights into seed plant evolution. Our findings and annotated genome assembly extend the utility of Ceratopteris as a model for investigating and teaching plant biology.