
    Curated genome annotation of Oryza sativa ssp. japonica and comparative genome analysis with Arabidopsis thaliana

    We present here the annotation of the complete genome of rice Oryza sativa L. ssp. japonica cultivar Nipponbare. All functional annotations for proteins and non-protein-coding RNA (npRNA) candidates were manually curated. Functions were identified or inferred in 19,969 (70%) of the proteins, and 131 possible npRNAs (including 58 antisense transcripts) were found. Almost 5000 annotated protein-coding genes were found to be disrupted in insertional mutant lines, which will accelerate future experimental validation of the annotations. The rice loci were determined by using cDNA sequences obtained from rice and other representative cereals. Our conservative estimate based on these loci and an extrapolation suggested that the gene number of rice is ~32,000, which is smaller than previous estimates. We conducted comparative analyses between rice and Arabidopsis thaliana and found that both genomes possessed several lineage-specific genes, which might account for the observed differences between these species, while they had similar sets of predicted functional domains among the protein sequences. A system to control translational efficiency seems to be conserved across large evolutionary distances. Moreover, the evolutionary process of protein-coding genes was examined. Our results suggest that natural selection may have played a role for duplicated genes in both species, so that duplication was suppressed or favored in a manner that depended on the function of a gene

    Germin and germin-like proteins: evolution, structure, and function

    Germin and germin-like proteins (GLPs) are encoded by a family of genes found in all plants. They are part of the cupin superfamily of biochemically diverse proteins, a superfamily that has a conserved tertiary structure, though with limited similarity in primary sequence. The subgroups of GLPs have different enzyme functions that include the two hydrogen peroxide-generating enzymes, oxalate oxidase (OxO) and superoxide dismutase. This review summarizes the sequence and structural details of GLPs and also discusses their evolutionary progression, particularly their amplification in gene number during the evolution of the land plants. In terms of function, the GLPs are known to be differentially expressed during specific periods of plant growth and development, a pattern of evolutionary subfunctionalization. They are also implicated in the response of plants to biotic (viruses, bacteria, mycorrhizae, fungi, insects, nematodes, and parasitic plants) and abiotic (salt, heat/cold, drought, nutrient, and metal) stress. Most detailed data come from studies of fungal pathogenesis in cereals. This involvement with the protection of plants from environmental stress of various types has led to numerous plant breeding studies that have found links between GLPs and QTLs for disease and stress resistance. In addition the OxO enzyme has considerable commercial significance, based principally on its use in the medical diagnosis of oxalate concentration in plasma and urine. Finally, this review provides information on the nutritional importance of these proteins in the human diet, as several members are known to be allergenic, a feature related to their thermal stability and evolutionary connection to the seed storage proteins, also members of the cupin superfamily

    Global analysis of the sugarcane microtranscriptome reveals a unique composition of small RNAs associated with axillary bud outgrowth

    Axillary bud outgrowth determines shoot architecture and is under the control of endogenous hormones and a fine-tuned gene-expression network, which probably includes small RNAs (sRNAs). Although it is well known that sRNAs act broadly in plant development, our understanding about their roles in vegetative bud outgrowth remains limited. Moreover, the expression profiles of microRNAs (miRNAs) and their targets within axillary buds are largely unknown. Here, we employed sRNA next-generation sequencing as well as computational and gene-expression analysis to identify and quantify sRNAs and their targets in vegetative axillary buds of the biofuel crop sugarcane (Saccharum spp.). Computational analysis allowed the identification of 26 conserved miRNA families and two putative novel miRNAs, as well as a number of trans-acting small interfering RNAs. sRNAs associated with transposable elements and protein-encoding genes were similarly represented in both inactive and developing bud libraries. Conversely, sequencing and quantitative reverse transcription-PCR results revealed that specific miRNAs were differentially expressed in developing buds, and some correlated negatively with the expression of their targets at specific stages of axillary bud development. For instance, the expression patterns of miR159 and its target GAMYB suggested that they may play roles in regulating abscisic acid-signalling pathways during sugarcane bud outgrowth. Our work reveals, for the first time, differences in the composition and expression profiles of diverse sRNAs and targets between inactive and developing vegetative buds that, together with the endogenous balance of specific hormones, may be important in regulating axillary bud outgrowth

    A full-length enriched cDNA library and expressed sequence tag analysis of the parasitic weed, Striga hermonthica

    BACKGROUND: The obligate parasitic plant witchweed (Striga hermonthica) infects major cereal crops such as sorghum, maize, and millet, and is the most devastating weed pest in Africa. An understanding of the nature of its parasitism would contribute to the development of more sophisticated management methods. However, the molecular and genomic resources currently available for the study of S. hermonthica are limited. RESULTS: We constructed a full-length enriched cDNA library of S. hermonthica, sequenced 37,710 clones from the library, and obtained 67,814 expressed sequence tag (EST) sequences. The ESTs were assembled into 17,317 unigenes that included 10,319 contigs and 6,818 singletons. The S. hermonthica unigene dataset was subjected to a comparative analysis with other plant genomes or ESTs. Approximately 80% of the unigenes have homologs in other dicotyledonous plants including Arabidopsis, poplar, and grape. We found that 589 unigenes are conserved in the hemiparasitic Triphysaria species but not in other plant species. These are good candidates for genes specifically involved in plant parasitism. Furthermore, we found 1,445 putative simple sequence repeats (SSRs) in the S. hermonthica unigene dataset. We tested 64 pairs of PCR primers flanking the SSRs to develop genetic markers for the detection of polymorphisms. Most primer sets amplified polymorphic bands from individual plants collected at a single location, indicating high genetic diversity in S. hermonthica. We selected 10 primer pairs to analyze S. hermonthica harvested in the field from different host species and geographic locations. A clustering analysis suggests that genetic distances are not correlated with host specificity. CONCLUSIONS: Our data provide the first extensive set of molecular resources for studying S. hermonthica, and include EST sequences, a comparative analysis with other plant genomes, and useful genetic markers. All the data are stored in a web-based database and freely available. These resources will be useful for genome annotation, gene discovery, functional analysis, molecular breeding, epidemiological studies, and studies of plant evolution.

    Identification of stress-responsive genes in an indica rice (Oryza sativa L.) using ESTs generated from drought-stressed seedlings

    The impacts of drought on plant growth and development limit cereal crop production worldwide. Rice (Oryza sativa) productivity and production are severely affected by recurrent droughts in almost all agroecological zones. With the advent of molecular and genomic technologies, emphasis is now placed on understanding the mechanisms of genetic control of the drought-stress response. In order to identify genes associated with water-stress response in rice, ESTs generated from a normalized cDNA library, constructed from drought-stressed leaf tissue of an indica cultivar, Nagina 22, were used. Analysis of 7794 cDNA sequences led to the identification of 5815 rice ESTs. Of these, 334 exhibited no significant sequence homology with any rice ESTs or full-length cDNAs in public databases, indicating that these transcripts are enriched during drought stress. Analysis of these 5815 ESTs led to the identification of 1677 unique sequences. To characterize this drought transcriptome further and to identify candidate genes associated with the drought-stress response, the rice data were compared with those for abiotic stress-induced sequences obtained from expression profiling studies in Arabidopsis, barley, maize, and rice. This comparative analysis identified 589 putative stress-responsive genes (SRGs) that are shared by these diverse plant species. Further, the identified leaf SRGs were compared to expression profiles for a drought-stressed rice panicle library to identify common sequences. Significantly, 125 genes were found to be expressed under drought stress in both tissues. The functional classification of these 125 genes showed that a majority of them are associated with cellular metabolism, signal transduction, and transcriptional regulation

    BIOINFORMATICS TOOL DEVELOPMENT AND SEQUENCE ANALYSIS OF ROSACEAE FAMILY EXPRESSED SEQUENCE TAGS

    BACKGROUND: An international community of researchers has generated a significant number of Expressed Sequence Tags (ESTs) for the Rosaceae, an economically important plant family that includes most temperate fruits such as apple, cherry, peach, and strawberry as well as other commercially valuable members. ESTs are fragments of expressed genes that can be used for gene discovery, developing markers for mapping and cultivar improvement via marker assisted selection. Efficient dissemination and integration of this data is best facilitated through a centralized and curated database with associated sequence analysis tools. DESCRIPTION: The Genome Database for Rosaceae (GDR) was initiated to provide a curated and integrated web-based relational database for this family. I developed a key component of GDR to assemble and annotate the publicly available ESTs from the four main genera of the family (Prunus, Malus, Fragaria, Rosa). I created both genera and family level unigenes using the software CAP3 after extensive filtering, trimming and assembly. Further analysis includes marker mining for single nucleotide polymorphisms (SNPs) and simple sequence repeats (SSRs) with putative primer identification, and oligo identification for potential microarray development. Functional genomics efforts are supported with sequence similarity searching against major protein and nucleotide databases, gene product ontology assignment, and protein motif identification. I deployed the entire project on the GDR with all data available for browsing, searching, and downloading. CONCLUSIONS: The GDR and its associated EST unigene project are meeting a major need for timely annotation and curation of sequence data for the Rosaceae community. The results of my analysis highlight major genes and pathways of interest including ripening, disease resistance, and transcription factors. 
The easily accessible pool of annotated coding sequences should further both functional and structural genomics characterization in Rosaceae. The unigene elucidates the levels of sequence similarity shared across different plant species and the implications for resource sharing across the family. GDR can be accessed at http://www.rosaceae.org/

    Comparative analyses of six solanaceous transcriptomes reveal a high degree of sequence conservation and species-specific transcripts

    BACKGROUND: The Solanaceae is a family of closely related species with diverse phenotypes that have been exploited for agronomic purposes. Previous studies involving a small number of genes suggested sequence conservation across the Solanaceae. The availability of large collections of Expressed Sequence Tags (ESTs) for the Solanaceae now provides the opportunity to assess sequence conservation and divergence on a genomic scale. RESULTS: All available ESTs and Expressed Transcripts (ETs), 449,224 sequences for six Solanaceae species (potato, tomato, pepper, petunia, tobacco and Nicotiana benthamiana), were clustered and assembled into gene indices. Examination of gene ontologies revealed that the transcripts within the gene indices encode a similar suite of biological processes. Although the ESTs and ETs were derived from a variety of tissues, 55–81% of the sequences had significant similarity at the nucleotide level with sequences among the six species. Putative orthologs could be identified for 28–58% of the sequences. This high degree of sequence conservation was supported by expression profiling using heterologous hybridizations to potato cDNA arrays that showed similar expression patterns in mature leaves for all six solanaceous species. 16–19% of the transcripts within the six Solanaceae gene indices did not have matches among Solanaceae, Arabidopsis, rice or 21 other plant gene indices. CONCLUSION: Results from this genome scale analysis confirmed a high level of sequence conservation at the nucleotide level of the coding sequence among Solanaceae. Additionally, the results indicated that part of the Solanaceae transcriptome is likely to be unique for each species

    Cell-type specific analysis of translating RNAs in developing flowers reveals new levels of control

    Determining both the expression levels of mRNA and the regulation of its translation is important in understanding specialized cell functions. In this study, we describe both the expression profiles of cells within spatiotemporal domains of the Arabidopsis thaliana flower and the post-transcriptional regulation of these mRNAs, at nucleotide resolution. We express a tagged ribosomal protein under the promoters of three master regulators of flower development. By precipitating tagged polysomes, we isolated cell type specific mRNAs that are probably translating, and quantified those mRNAs through deep sequencing. Cell type comparisons identified known cell-specific transcripts and uncovered many new ones, from which we inferred cell type-specific hormone responses, promoter motifs and coexpressed cognate binding factor candidates, and splicing isoforms. By comparing translating mRNAs with steady-state overall transcripts, we found evidence for widespread post-transcriptional regulation at both the intron splicing and translational stages. Sequence analyses identified structural features associated with each step. Finally, we identified a new class of noncoding RNAs associated with polysomes. Findings from our profiling lead to new hypotheses in the understanding of flower development

    Floral gene resources from basal angiosperms for comparative genomics research

    BACKGROUND: The Floral Genome Project was initiated to bridge the genomic gap between the most broadly studied plant model systems. Arabidopsis and rice, although now completely sequenced and under intensive comparative genomic investigation, are separated by at least 125 million years of evolutionary time, and cannot in isolation provide a comprehensive perspective on structural and functional aspects of flowering plant genome dynamics. Here we discuss new genomic resources available to the scientific community, comprising cDNA libraries and Expressed Sequence Tag (EST) sequences for a suite of phylogenetically basal angiosperms specifically selected to bridge the evolutionary gaps between model plants and provide insights into gene content and genome structure in the earliest flowering plants. RESULTS: Random sequencing of cDNAs from representatives of phylogenetically important eudicot, non-grass monocot, and gymnosperm lineages has so far (as of 12/1/04) generated 70,514 ESTs and 48,170 assembled unigenes. Efficient sorting of EST sequences into putative gene families based on whole Arabidopsis/rice proteome comparison has permitted ready identification of cDNA clones for finished sequencing. Preliminarily, (i) proportions of functional categories among sequenced floral genes seem representative of the entire Arabidopsis transcriptome, (ii) many known floral gene homologues have been captured, and (iii) phylogenetic analyses of ESTs are providing new insights into the process of gene family evolution in relation to the origin and diversification of the angiosperms. CONCLUSION: Initial comparisons illustrate the utility of the EST data sets toward discovery of the basic floral transcriptome. 
These first findings also afford the opportunity to address a number of conspicuous evolutionary genomic questions, including reproductive organ transcriptome overlap between angiosperms and gymnosperms, genome-wide duplication history, lineage-specific gene duplication and functional divergence, and analyses of adaptive molecular evolution. Since not all genes in the floral transcriptome will be associated with flowering, these EST resources will also be of interest to plant scientists working on other functions, such as photosynthesis, signal transduction, and metabolic pathways

    Structural and functional analysis of rice genome

    Rice is an excellent system for plant genomics as it has a modest-sized genome of 430 Mb. It feeds more than half the population of the world. Draft sequences of the rice genome, derived by a whole-genome shotgun approach at relatively low coverage (4-6 X), were published and the International Rice Genome Sequencing Project (IRGSP) declared high quality (>10 X), genetically anchored, phase 2 level sequence in 2002. In addition, phase 3 level finished sequence of chromosomes 1, 4 and 10 (out of 12 chromosomes of rice) has already been reported by scientists from the IRGSP consortium. Various estimates of genes in rice place the number at >50,000. Already, over 28,000 full-length cDNAs have been sequenced, most of which map to genetically anchored genome sequence. Such information is very useful in revealing novel features of macro- and micro-level synteny of the rice genome with other cereals. Microarray analysis is unraveling the identity of rice genes expressed in a temporal and spatial manner and should help target candidate genes useful for improving traits of agronomic importance. Simultaneously, functional analysis of the rice genome has been initiated by marker-based characterization of useful genes and by employing functional knock-outs created by mutation or gene tagging. Integration of this enormous information is expected to catalyze tremendous activity on basic and applied aspects of rice genomics