24 research outputs found

    Identification of shared single copy nuclear genes in Arabidopsis, Populus, Vitis and Oryza and their phylogenetic utility across various taxonomic levels

    BACKGROUND: Although the overwhelming majority of genes found in angiosperms are members of gene families, and both gene- and genome-duplication are pervasive forces in plant genomes, some genes are sufficiently distinct from all other genes in a genome that they can be operationally defined as 'single copy'. Using the gene clustering algorithm MCL-tribe, we have identified a set of 959 genes that are shared as single copy genes in the genomes of Arabidopsis thaliana, Populus trichocarpa, Vitis vinifera and Oryza sativa. To characterize these genes, we have performed a number of analyses examining GO annotations, coding sequence length, number of exons, number of domains, and presence in distant lineages such as Selaginella and Physcomitrella, along with phylogenetic analysis to estimate copy number in other seed plants and to demonstrate their phylogenetic utility. We then provide examples of how these genes may be used in phylogenetic analyses to reconstruct organismal history, both by using extant coverage in EST databases for seed plants and de novo amplification via RT-PCR in the family Brassicaceae. RESULTS: There are 959 single copy nuclear genes shared in Arabidopsis, Populus, Vitis and Oryza ["APVO SSC genes"]. The majority of these genes are also present in the Selaginella and Physcomitrella genomes. Public EST sets for 197 species suggest that most of these genes are present across a diverse collection of seed plants, and appear to exist as single or very low copy genes, though exceptions are seen in recently polyploid taxa and in lineages where there is significant evidence for a shared large-scale duplication event. Genes encoding proteins localized in organelles are more commonly single copy than expected by chance, but the evolutionary forces responsible for this bias are unknown. Regardless of the evolutionary mechanisms responsible for the large number of shared single copy genes in diverse flowering plant lineages, these genes are valuable for phylogenetic and comparative analyses. Eighteen of the APVO SSC genes were amplified in the Brassicaceae using RT-PCR and directly sequenced. Alignments of these sequences provide improved resolution of Brassicaceae phylogeny compared to recent studies using plastid and ITS sequences. An analysis of sequences from 13 APVO SSC genes from 69 species of seed plants, derived mainly from public EST databases, yielded a phylogeny that was largely congruent with prior hypotheses based on multiple plastid sequences. Whereas single gene phylogenies that rely on EST sequences have limited bootstrap support as the result of limited sequence information, concatenated alignments result in phylogenetic trees with strong bootstrap support for already established relationships. Overall, these single copy nuclear genes are promising markers for phylogenetics, and contain a greater proportion of phylogenetically informative sites than commonly used protein-coding sequences from the plastid or mitochondrial genomes. CONCLUSIONS: Putatively orthologous, shared single copy nuclear genes provide a vast source of new evidence for plant phylogenetics, genome mapping, and other applications, as well as a substantial class of genes for which functional characterization is needed. Preliminary evidence indicates that many of the shared single copy nuclear genes identified in this study may be well suited as markers for addressing phylogenetic hypotheses at a variety of taxonomic levels.

    Comparison of next generation sequencing technologies for transcriptome characterization

    BACKGROUND: We have developed a simulation approach to help determine the optimal mixture of sequencing methods for the most complete and cost-effective transcriptome sequencing. We compared simulation results for traditional capillary sequencing with "Next Generation" (NG) ultra high-throughput technologies. The simulation model was parameterized using mappings of 130,000 cDNA sequence reads to the Arabidopsis genome (NCBI Accession SRA008180.19). We also generated 454-GS20 sequences and de novo assemblies for the basal eudicot California poppy (Eschscholzia californica) and the magnoliid avocado (Persea americana) using a variety of methods for cDNA synthesis. RESULTS: The Arabidopsis reads tagged more than 15,000 genes, including new splice variants and extended UTR regions. Of the total 134,791 reads (13.8 MB), 119,518 (88.7%) mapped exactly to known exons, while 1,117 (0.8%) mapped to introns, 11,524 (8.6%) spanned annotated intron/exon boundaries, and 3,066 (2.3%) extended beyond the end of annotated UTRs. Sequence-based inference of relative gene expression levels correlated significantly with microarray data. As expected, NG sequencing of normalized libraries tagged more genes than non-normalized libraries, although non-normalized libraries yielded more full-length cDNA sequences. The Arabidopsis data were used to simulate additional rounds of NG and traditional EST sequencing, and various combinations of each. Our simulations suggest a combination of FLX and Solexa sequencing for optimal transcriptome coverage at modest cost. We have also developed ESTcalc (http://fgp.huck.psu.edu/NG_Sims/ngsim.pl), an online webtool that allows users to explore the results of this study by specifying individualized costs and sequencing characteristics. CONCLUSION: NG sequencing technologies are a highly flexible set of platforms that can be scaled to suit different project goals. In terms of sequence coverage alone, NG sequencing is a dramatic advance over capillary-based sequencing, but NG sequencing also presents significant challenges in assembly and sequence accuracy due to short read lengths, method-specific sequencing errors, and the absence of physical clones. These problems may be overcome by hybrid sequencing strategies using a mixture of sequencing methodologies, by new assemblers, and by sequencing more deeply. Sequencing and microarray outcomes from multiple experiments suggest that our simulator will be useful for guiding NG transcriptome sequencing projects in a wide range of organisms.

    Characterization of the basal angiosperm Aristolochia fimbriata: a potential experimental system for genetic studies

    BACKGROUND: Previous studies in basal angiosperms have provided insight into the diversity within the angiosperm lineage and helped to polarize analyses of flowering plant evolution. However, there is still not an experimental system for genetic studies among basal angiosperms to facilitate comparative studies and functional investigation. It would be desirable to identify a basal angiosperm experimental system that possesses many of the features found in existing plant model systems (e.g., Arabidopsis and Oryza). RESULTS: We have considered all basal angiosperm families for general characteristics important for experimental systems, including availability to the scientific community, growth habit, and membership in a large basal angiosperm group that displays a wide spectrum of phenotypic diversity. Most basal angiosperms are woody or aquatic, thus are not well-suited for large scale cultivation, and were excluded. We further investigated members of Aristolochiaceae for ease of culture, life cycle, genome size, and chromosome number. We demonstrated self-compatibility for Aristolochia elegans and A. fimbriata, and transformation with a GFP reporter construct for Saruma henryi and A. fimbriata. Furthermore, A. fimbriata was easily cultivated with a life cycle of just three months, could be regenerated in a tissue culture system, and had one of the smallest genomes among basal angiosperms. An extensive multi-tissue EST dataset was produced for A. fimbriata that includes over 3.8 million 454 sequence reads. CONCLUSIONS: Aristolochia fimbriata has numerous features that facilitate genetic studies and is suggested as a potential model system for use with a wide variety of technologies. Emerging genetic and genomic tools for A. fimbriata and closely related species can aid the investigation of floral biology, developmental genetics, biochemical pathways important in plant-insect interactions as well as human health, and various other features present in early angiosperms.

    Floral gene resources from basal angiosperms for comparative genomics research

    BACKGROUND: The Floral Genome Project was initiated to bridge the genomic gap between the most broadly studied plant model systems. Arabidopsis and rice, although now completely sequenced and under intensive comparative genomic investigation, are separated by at least 125 million years of evolutionary time, and cannot in isolation provide a comprehensive perspective on structural and functional aspects of flowering plant genome dynamics. Here we discuss new genomic resources available to the scientific community, comprising cDNA libraries and Expressed Sequence Tag (EST) sequences for a suite of phylogenetically basal angiosperms specifically selected to bridge the evolutionary gaps between model plants and provide insights into gene content and genome structure in the earliest flowering plants. RESULTS: Random sequencing of cDNAs from representatives of phylogenetically important eudicot, non-grass monocot, and gymnosperm lineages has so far (as of 12/1/04) generated 70,514 ESTs and 48,170 assembled unigenes. Efficient sorting of EST sequences into putative gene families based on whole Arabidopsis/rice proteome comparison has permitted ready identification of cDNA clones for finished sequencing. Preliminarily, (i) proportions of functional categories among sequenced floral genes seem representative of the entire Arabidopsis transcriptome, (ii) many known floral gene homologues have been captured, and (iii) phylogenetic analyses of ESTs are providing new insights into the process of gene family evolution in relation to the origin and diversification of the angiosperms. CONCLUSION: Initial comparisons illustrate the utility of the EST data sets toward discovery of the basic floral transcriptome. These first findings also afford the opportunity to address a number of conspicuous evolutionary genomic questions, including reproductive organ transcriptome overlap between angiosperms and gymnosperms, genome-wide duplication history, lineage-specific gene duplication and functional divergence, and analyses of adaptive molecular evolution. Since not all genes in the floral transcriptome will be associated with flowering, these EST resources will also be of interest to plant scientists working on other functions, such as photosynthesis, signal transduction, and metabolic pathways.

    Comparative Transcriptome Analyses Reveal Core Parasitism Genes and Suggest Gene Duplication and Repurposing as Sources of Structural Novelty

    The origin of novel traits is recognized as an important process underlying many major evolutionary radiations. We studied the genetic basis for the evolution of haustoria, the novel feeding organs of parasitic flowering plants, using comparative transcriptome sequencing in three species of Orobanchaceae. Around 180 genes are upregulated during haustorial development following host attachment in at least two species, and these are enriched in proteases, cell wall modifying enzymes, and extracellular secretion proteins. Additionally, about 100 shared genes are upregulated in response to haustorium inducing factors prior to host attachment. Collectively, we refer to these newly identified genes as putative "parasitism genes." Most of these parasitism genes are derived from gene duplications in a common ancestor of Orobanchaceae and Mimulus guttatus, a related nonparasitic plant. Additionally, the signature of relaxed purifying selection and/or adaptive evolution at specific sites was detected in many haustorial genes, and may play an important role in parasite evolution. Comparative analysis of gene expression patterns in parasitic and nonparasitic angiosperms suggests that parasitism genes are derived primarily from root and floral tissues, but with some genes co-opted from other tissues. Gene duplication, often taking place in a nonparasitic ancestor of Orobanchaceae, followed by regulatory neofunctionalization, was an important process in the origin of parasitic haustoria.

    Glucocorticoid receptor-regulated TcLEC2 expression triggers somatic embryogenesis in Theobroma cacao leaf tissue.

    Theobroma cacao, the source of cocoa, is a crop of particular importance in many developing countries. Availability of elite planting material is a limiting factor for increasing productivity of Theobroma cacao; therefore, the development of new strategies for clonal propagation is essential to improve farmers' incomes and to meet increasing global demand for cocoa. To develop a more efficient embryogenesis system for cacao, tissue was transformed with a transgene encoding a fusion of Leafy Cotyledon 2 (TcLEC2) to a glucocorticoid receptor domain (GR) to control nuclear localization of the protein. Upon application of the glucocorticoid dexamethasone (dex), downstream targets of LEC2 involved in seed development were up-regulated and somatic embryos (SEs) were successfully regenerated from TcLEC2-GR transgenic flower and leaf tissue in large numbers. Immature SEs regenerated from TcLEC2-GR leaves were smaller in size than immature SEs from floral tissue, suggesting a different ontogenetic origin. Additionally, exposure of TcLEC2-GR floral explants to dex increased the number of SEs compared to floral explants from control, non-transgenic trees or from TcLEC2-GR floral explants not treated with dex. Testing different durations of exposure to dex indicated that a three-day treatment produced optimal embryo regeneration. Leaf-derived SEs were successfully grown to maturity, converted into plants, and established in the greenhouse, demonstrating that these embryos are fully developmentally competent. In summary, we demonstrate that regulating TcLEC2 activity offers a powerful new strategy for optimizing somatic embryogenesis pipelines for cacao.

    Transient Expression of CRISPR/Cas9 Machinery Targeting TcNPR3 Enhances Defense Response in Theobroma cacao

    Theobroma cacao, the source of cocoa, suffers significant losses to a variety of pathogens resulting in reduced incomes for millions of farmers in developing countries. Development of disease-resistant cacao varieties is an essential strategy to combat this threat, but is limited by sources of genetic resistance and the slow generation time of this tropical tree crop. In this study, we present the first application of genome editing technology in cacao, using Agrobacterium-mediated transient transformation to introduce CRISPR/Cas9 components into cacao leaves and cotyledon cells. As a first proof of concept, we targeted the cacao Non-Expressor of Pathogenesis-Related 3 (TcNPR3) gene, a suppressor of the defense response. After demonstrating activity of designed single-guide RNAs (sgRNA) in vitro, we used Agrobacterium to introduce a CRISPR/Cas9 system into leaf tissue, and identified the presence of deletions in 27% of TcNPR3 copies in the treated tissues. The edited tissue exhibited an increased resistance to infection with the cacao pathogen Phytophthora tropicalis and elevated expression of downstream defense genes. Analysis of off-target mutagenesis in sequences similar to sgRNA target sites using high-throughput sequencing did not reveal mutations above background sequencing error rates. These results confirm the function of NPR3 as a repressor of the cacao immune system and demonstrate the application of CRISPR/Cas9 as a powerful functional genomics tool for cacao. Several stably transformed and genome-edited somatic embryos were obtained via Agrobacterium-mediated transformation, and ongoing work will test the effectiveness of this approach at a whole plant level.

    Image1.TIF

    Theobroma cacao, the source of cocoa, suffers significant losses to a variety of pathogens resulting in reduced incomes for millions of farmers in developing countries. Development of disease-resistant cacao varieties is an essential strategy to combat this threat, but is limited by sources of genetic resistance and the slow generation time of this tropical tree crop. In this study, we present the first application of genome editing technology in cacao, using Agrobacterium-mediated transient transformation to introduce CRISPR/Cas9 components into cacao leaves and cotyledon cells. As a first proof of concept, we targeted the cacao Non-Expressor of Pathogenesis-Related 3 (TcNPR3) gene, a suppressor of the defense response. After demonstrating activity of designed single-guide RNAs (sgRNA) in vitro, we used Agrobacterium to introduce a CRISPR/Cas9 system into leaf tissue, and identified the presence of deletions in 27% of TcNPR3 copies in the treated tissues. The edited tissue exhibited an increased resistance to infection with the cacao pathogen Phytophthora tropicalis and elevated expression of downstream defense genes. Analysis of off-target mutagenesis in sequences similar to sgRNA target sites using high-throughput sequencing did not reveal mutations above background sequencing error rates. These results confirm the function of NPR3 as a repressor of the cacao immune system and demonstrate the application of CRISPR/Cas9 as a powerful functional genomics tool for cacao. Several stably transformed and genome-edited somatic embryos were obtained via Agrobacterium-mediated transformation, and ongoing work will test the effectiveness of this approach at a whole plant level.